described. The process of employing the MacPitts silicon
compiler to design an 8-bit pipelined digital multiplier is
presented, and the resulting design is evaluated. The
process of installing and debugging the MacPitts compiler
and the Caesar VLSI graphics editor on the VAX-11/780

I. INTRODUCTION


A. BACKGROUND

The initial work done on the design of very large scale
integrated circuits (VLSI) at the Naval Postgraduate School
(NPS) used a set of software tools which require designer
interaction at all levels of the design process. These
tools and their use are described in a recent thesis by
Conradi and Hauenstein [Ref. 1].

Their design approach centers around the use of: (1)
machine-generated programmable logic arrays (PLA's)
specified in a language which translates boolean equations
into circuit layouts, and (2) a library of standard cell
layouts from which other required circuit primitives are
selected. The designer arranges the PLA's and standard
cells on a "floorplan" designed by heuristic methods, and
interconnects them with a network of individual wires
devised by the designer and encoded as a "wirelist." The
floorplan layout and addition of interconnecting wires must
be done manually, typically on graph paper at the drawing
board. The results are manually encoded in an input file
format readable by a layout language program ("ell" in the
case of the cited research) which merges the designer's
floorplan and wirelist with: (1) the selected library cell
layout descriptions and (2) the PLA layout descriptions
produced by the separate PLA generation program. The
circuit layout program then produces a description of the
total design in another standard file interchange format,
the Caltech Intermediate Form (CIF) described by Mead and
Conway [Ref. 2: pp. 115-127]. The CIF file can then be used
as a source for extracting design validation information,
as well as for producing the photographic masks used for
circuit fabrication.


The design process outlined has the advantage of giving
the designer thorough control over the architecture of the
circuit. The human ability to evaluate alternatives,
recognize patterns and grasp complex multi-dimensional
relationships between individual elements and the whole
design exceeds that of any current machine algorithm.

On the other hand, this process absorbs large amounts of
the designer's time in performing the drudgery of planning
and encoding the layout details. There are at least four
things wrong with involving the designer at this level:

(1) It is repetitious work, and therefore error-prone.

(2) It is slow. (Southard [Ref. 3] and others have noted
that design costs far outweigh production costs for custom
VLSI.)

(3) Preoccupation with mechanical details restricts a
designer's freedom to explore high-level architectural
issues such as bus structure, degree of pipelining, and
speed-complexity tradeoffs.

(4) Major modifications to the layout are very expensive to
make if they come late in the design cycle, i.e. after cell
interconnection.

B. CURRENT RESEARCH GOALS

With this background for motivation, it was decided to
investigate additional VLSI computer-aided design tools
which would reduce time-to-design, minimize the occurrence
of human error in layout, and make it possible to explore
design alternatives with greater ease.

The major tool available in the VLSI research community
for this purpose is MacPitts. MacPitts (the name is derived
from two early researchers, McCulloch and Pitts, who studied
neurological systems from a mathematical and logic
standpoint) is a silicon compiler developed at the
Massachusetts Institute of Technology's Lincoln Laboratories
in 1981-1982 [Ref. 4]. A silicon compiler, according to one
recent definition [Ref. 5] which captures current usage of
this often misunderstood term, is "a program that, given a
description of what a circuit is supposed to do, will
produce a chip layout that implements that function in
silicon." There is enough latitude to allow fundamentally
different approaches to silicon compilation to coexist under
this definition, as will be demonstrated in the following
chapter. In any case, however, the term compiler is apt.
Like software compilers, these programs take high-level
source code descriptions which are human-readable (and
perhaps, but not necessarily, algorithmic) and "convert"
them into low-level object code (a CIF file) which is
directly readable by a machine. In the case of a silicon
compiler, however, the machine is not a general-purpose
computer, but a photo-resist mask generator at a silicon
foundry facility that fabricates integrated circuits.

Another function that the most advanced silicon
compilers perform is resource allocation. Software
compilers free the programmer from making decisions on where
in available memory space to store a particular machine code
word. Silicon compilers, at their best, free the designer
from deciding where on available silicon area to place a
particular circuit element. Resource allocation is a
one-dimensional job in software compilers, but a
two-dimensional job in silicon compilers. The constraints on
efficient resource allocation in silicon are severe:
compactness is almost always one goal, as is speed of
operation (minimum propagation delay). In memory allocation,
compactness is not essential, unless one is using a
sequential access memory.

Installation of MacPitts on the NPS VAX-11/780 computer
facility was expected to be a "turn-key" operation. This
was in fact not the case. A large amount of effort was
spent in researching and performing the modifications to the
host computer environment which enable it to run the
MacPitts system, as well as in troubleshooting the
distributed MacPitts source code itself. The installation
process is described in Appendix A.

MacPitts has no progressive breakpoint facilities to
allow a designer freedom to observe or alter the layout
process at any point during execution. Once invoked,
MacPitts produces a final interconnected layout, complete
with bonding pads, or no layout at all. Therefore, it was
considered worthwhile to implement the color graphics
editor, Caesar, designed by John Ousterhout at the
University of California at Berkeley [Ref. 6]. This tool
allows the chip layout to be examined in detail on a color
CRT monitor, and permits editing of the layout. Caesar
represents the layout internally as a hierarchy of cells,
which yields insight into the ways that MacPitts partitions
the layout process.

The installation of Caesar, while not as difficult as
MacPitts, involved setting some site-dependent parameters as
well as finding and correcting a bug in the distributed
source code. These activities are described in Appendix B.
Appendix C contains a copy of the on-line manual pages for
Caesar and other Berkeley tools used in this research.




II. APPROACHES TO SILICON COMPILATION


A. VLSI DESIGN ACTIVITIES DOMAIN

When trying to understand how silicon compilers work it
is instructive to think of two design problems in the order
in which they must be attacked. The first is translation of
a brief behavioral or functional description into a more
precise intermediate description that is still independent
of the specific implementation technology. The second is
the automatic generation of a chip layout in a target
semiconductor medium, using the intermediate description as
a guide. It is important to separate the second activity
from the first when one is designing a silicon compiler
because of the speed at which the target semiconductor
technologies are evolving. That is, complementary metal
oxide semiconductor (CMOS) processes are rapidly overtaking
N-channel metal oxide semiconductor (NMOS) processes.
Multiple-layer metallization is also becoming more common,
and minimum circuit feature sizes are shrinking as better
control over the manufacturing processes is achieved.
Computer architectures and functions evolve more slowly, by
comparison.

These two problems may be further subdivided. Werner
[Ref. 7] has contributed the idea that a spectrum of VLSI
design activities exists with corresponding media for the
exchange of information by the computer-aided design tools
employed at each band in the spectrum. (See figure 2.1.)
Silicon compilers try to span the whole spectrum, an
ambitious undertaking.

It should be recognized that all silicon compilers
designed to date have to some extent traded performance of
the ultimate VLSI design (as measured by operating speed and
area efficiency) for reduced design time for the chip (and
for the silicon compiler itself). Gross [Ref. 8] quotes
estimates that broad spectrum silicon compilers reduce
design costs (time) by a factor of 20. But Wallich, in a
recent survey of silicon compiler efforts [Ref. 5], states
that designs produced by silicon compilers available today
tend to range from 15 to 200 percent larger than equivalent
hand-crafted designs.

Still, silicon compilers have been misunderstood by
researchers, as noted by Gross. Some, without fully
understanding the dimensionality of the VLSI design process,
believe that the design problem can be almost completely
solved by the application of current software methods and
tools. Others, seeing the obvious limitations of
contemporary silicon compilers and not grasping the
potential contributions to VLSI from computer science
technology transfer, believe that efficient VLSI designs
will always be essentially manual. Murphy of Bell
Laboratories, quoted by Werner [Ref. 7], states that "total
automation is inappropriate -- either now or in the
foreseeable future -- in anything where you have a
competitive need for performance." Nevertheless, Bell Labs
is conducting research of its own into silicon compilers.
Their "Plex" project, reported in a more recent paper
[Ref. 9], produces layouts of microcomputers given, as
input, the program (in assembly or C language) that the
microcomputer is to execute.

According to Wallich, the ultimate silicon compiler, now
just a dream, will not only be able to take a behavioral
description and produce a geometrical description of the
chip suitable for input to a mask making machine, but will
do so for any kind of chip: microprocessor, signal
processor, or even analog-digital hybrid, for which the
design rules are far more complex. The subtle process of
architectural optimization (i.e. selecting a best floor
plan from the myriad possibilities), which occurs in the
middle of the design activities spectrum, has so far not
been captured in an algorithm. To achieve some breadth
without being overwhelmed by complexity, silicon compilers
have tended to contain built-in assumptions about a "target
architecture." They are optimized for producing a certain
class of circuits (mostly microprocessors) and produce
layouts of reasonable area and speed only for applications
best suited to their target architecture.

C. LIMITED SPECTRUM COMPILERS (TRANSLATORS)

For completeness, it is necessary to mention those VLSI
design tools in current use which fall short of covering the
design spectrum. They are:

• Random logic/standard-cell place-and-route systems,

• Module compilers to implement boolean logic, including:

    • Gate array compilers,

    • PLA generators,

    • Regular expression compilers for finite-state
      machines,

• Layout languages,

• Interactive graphical layout editors.

a. Common Properties

The first broad spectrum translators of interest
are the floor planners. They all employ a structural
specification language in which the specification always
corresponds extremely closely to a description of the
designer's mental model of how the chip should be laid out.
They produce, as an initial output, a skeleton of the layout
similar to an architect's floor plan. Subsequently, floor
planners fill the "rooms" with cells from a standard
library. Some floor planners, of which Johannsen's Bristle
Blocks is a pioneering example [Ref. 10], can linearly
stretch cells to match up the interconnections of abutting
cells (so-called "pitch matching").

b. F.I.R.S.T.

The current state of the art in floor planners
is represented by the F.I.R.S.T. (Fast Implementation of
Real-Time Signal Transforms) silicon compiler developed at
Edinburgh University [Ref. 11]. The F.I.R.S.T. compiler
produces layouts of digital signal processing systems
implemented as hard-wired networks of pipelined bit-serial
operators. The floor plan of F.I.R.S.T. chips (see figure
2.2) consists of a central wiring channel with operators
arranged as function blocks around the "waterfront." Each
bit-serial operator is implemented as a separate function
block which in turn is assembled from a library of
hand-designed cells. The function blocks are arranged, in
the order of their high-level specification by the designer,
in two rows along either side of the wiring channel which
accommodates all interconnections between the blocks. This
uncomplicated and novel layout methodology results in the
non-use of about 20% of the total chip area (because the
blocks may have varied heights). At present, F.I.R.S.T.
supports only the N-channel metal oxide semiconductor (NMOS)
technology.
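
The way varied block heights lead to unused area can be
illustrated with a short calculation. The following Python
sketch is not part of F.I.R.S.T.; the function name
two_row_waste, the block dimensions, and the rule that blocks
alternate between the two rows are all assumptions made for
illustration. Each row is taken to be as tall as its tallest
block and as long as the longer of the two rows.

```python
def two_row_waste(blocks):
    """blocks: list of (width, height) in specification order.

    Assumed placement rule: blocks alternate between the row above
    and the row below the wiring channel. Returns the fraction of the
    two rows' bounding area that no block covers (channel excluded).
    """
    rows = [blocks[0::2], blocks[1::2]]
    # Both rows share the length of the longer row.
    length = max(sum(w for w, _ in row) for row in rows)
    used = sum(w * h for w, h in blocks)
    # Each non-empty row is as tall as its tallest block.
    total = sum(length * max(h for _, h in row) for row in rows if row)
    return 1 - used / total
```

With four hypothetical blocks of equal width and heights
4, 4, 2 and 2 units, each row holds one tall and one short
block, and a quarter of the row area is left empty.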

The F.I.R.S.T. software consists of a small
suite of programs which provides the designer with a
complete specialized design environment. At the top level
is a language compiler that accepts a structural description
of the circuit in terms of a net list of bit-serial
operators. The F.I.R.S.T. system contains a library of
primitive operators (such as MULTIPLY, ADD, SORT, BIT DELAY,
etc.) as well as a number of more complex procedural
definitions (such as Biquad, Lattice, Butterfly, etc.) that
enable a range of signal processing architectures. The
language compiler produces an intermediate level format file
as output. This file is used by both a layout program,
which produces the mask geometry, and a simulator. The
simulator is event driven, which means that the voltage
values on circuit nodes are modeled as discrete bits of data
occurring at discrete time intervals. The functioning of
individual operators is simulated on a word-by-word basis in
response to a file of input commands. It is asserted that
the simulator has the ability to uncover timing bugs in the
data stream.
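
The event-driven principle can be sketched in a few lines of
Python. This is a generic gate-level illustration, not the
F.I.R.S.T. simulator, which works word-by-word on bit-serial
operators; the function name simulate, the gate table format
and the delay values are assumptions. Node values are
discrete bits, and work is done only when a scheduled event
changes a node.

```python
import heapq

def simulate(gates, initial, stimuli):
    """gates: {output_node: (function, input_nodes, delay)}
    initial: {node: bit}; stimuli: list of (time, node, bit).
    Returns the (time, node, bit) changes in time order."""
    values = dict(initial)
    queue = list(stimuli)
    heapq.heapify(queue)          # events ordered by time
    trace = []
    while queue:
        time, node, bit = heapq.heappop(queue)
        if values.get(node) == bit:
            continue              # no change, so no new events
        values[node] = bit
        trace.append((time, node, bit))
        # Re-evaluate only the gates driven by the changed node.
        for out, (fn, ins, delay) in gates.items():
            if node in ins:
                new = fn(*(values[i] for i in ins))
                heapq.heappush(queue, (time + delay, out, new))
    return trace
```

For a single hypothetical NOR gate with a delay of 2 time
units, raising input a at time 5 produces a change on the
output at time 7 and nothing else.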

A unique and useful aspect of F.I.R.S.T. is
incorporation of a translator program to convert the
simulator's output into a form suitable for use with an
automatic test pattern generator system.

2. Behavioral Specification Compilers

a. Common Properties

In contrast to the floor planners, which accept
structural specifications at the top level, are the
behavioral specification compilers, which do not require the
designer to possess a prior mental model of the architecture
to be designed. These systems attempt to translate a
high-level behavioral description of the circuit into a
geometric mask description. This step is a significant one
over floor planners.

b. Ayres' Work


Ayres is the first to have written a book-length
treatment of silicon compilation [Ref. 12]. Ayres' compiler
approach starts with a synchronous logic specification of
the chip behavior. Then follows a decomposition of this
specification repeatedly into a hierarchy of implementing
NMOS PLA's which become successively more area-efficient as
they become smaller. The system includes heuristics to
manage and optimize on-chip routing among the PLA's
generated. Ayres' compiler is potentially applicable to a
broader class of circuits than F.I.R.S.T., but is still not
efficient for a general range of problems. The scope of
applications was restricted intentionally to control
complexity. The very use of PLA's as the sole basic
building block restricts the area efficiency of this system.
Even though the PLA's themselves become more area-efficient
as they become smaller, the difficulty of managing their
interconnections limits the ultimate efficiency of the
layout.

c. MacPitts

MacPitts is the only broad spectrum silicon
compiler with which this author has had any first-hand
experience. It is also the most widely known and most
ambitious behavioral specification compiler in operation.

The hardware specification generated by MacPitts
is in the form of an NMOS technology CIF file. To cope with
the complexity of this project the designers restricted the
target architectures to microprocessors consisting of a data
path and a controller (see figure 2.3). Other restrictions
include fixing the width of the data path to one value
throughout the design, and requiring the designer to specify
control and parallelism explicitly. The latter is not
actually a restriction in one sense, however, because it
affords greater generality in designs. Except for making
pin assignments, the MacPitts user has no explicit control
over the floor plan of his design. The MacPitts target
architecture results in the same basic floor plan for all
designs, although this particular architecture is applicable
to a greater variety of digital problems than any other
scheme presently available.

Figure 2.3 Floor Plan of the MacPitts Target Architecture.


The data path portion of the layout consists of
a rectangular array of units called "organelles." An
organelle is a bit-wise functional unit. A standard library
of functions (adder, subtracter, shifters, incrementers,
comparators, etc.) is provided. Also, if the algorithmic
behavior specification calls for conditional data flow or
looping, the data path may also include multiplexers which
have connections for control signals. This multiplexer
organelle is not a library cell but is built into MacPitts.
Data storage registers, implemented as master-slave
flip-flops, are also "built-in organelles." These are
instantiated in the data path if their use is implied by the
algorithmic specification.

The vertical dimension of the data path outline
in figure 2.3 corresponds to the number of bits in the data
word. Longer word-lengths produce a taller chip. The
various organelles are cascaded along the horizontal
dimension of the data path outline.

The control portion of the layout acts on
various signals, either derived from the data path or
outside the chip, and implements whatever boolean logic is
necessary (as inferred from the algorithmic specification)
to generate control signals to drive the multiplexers in
the data path. The result is an implementation of a finite
state machine (FSM) as described in Mead and Conway
[Ref. 2]. The control unit does not use PLA's, but rather
structured NOR gate arrays called "Weinberger Arrays" which
can implement arbitrary combinational logic functions.
Weinberger [Ref. 15] demonstrates that his logic arrays have
three features which contribute to efficiency in an
automated circuit layout scheme.

•  They  simplify  the  formation  of  interconnection  patterns 
within  the  framework  of  a  standardized  layout. 

•  They  significantly  reduce  the  required  area  (by  elimi¬ 
nating  unused  inputs  and  separate  interconnection 
areas.) 

•  They  eliminate  crossing  of  signal  nets  (by  using  single 
level  wiring.) 
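
The claim that arrays of NOR gates can implement arbitrary
combinational logic rests on the fact that NOR is a
universal gate. The Python sketch below illustrates only
that logical point, using exclusive-OR as the example
function; it says nothing about the physical Weinberger
layout, and the function names are invented for
illustration.

```python
def nor(*inputs):
    """Multi-input NOR: 1 only when every input is 0."""
    return 0 if any(inputs) else 1

def xor_from_nor(a, b):
    """XOR built only from NOR gates (an inverter is a 1-input NOR)."""
    na, nb = nor(a), nor(b)
    both_zero = nor(a, b)     # 1 when a = b = 0
    both_one = nor(na, nb)    # 1 when a = b = 1
    # Output is 1 exactly when the inputs differ.
    return nor(both_zero, both_one)
```

Checking all four input combinations confirms that the
five-gate network reproduces the exclusive-OR truth table.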


State timing is controlled not by a two-phase
non-overlapping clock, which is somewhat standard in NMOS
VLSI, but by a three-phase clock which drives the register
circuit shown in figure 2.4. This clocking scheme
apparently allows a more compact layout of the register
organelle, but requires an extra pin in the package.


Figure 2.4 MacPitts Register Circuit and Timing Diagram.

One of the authors of MacPitts, Siskind, quoted
in [Ref. 7], admits that optimizing chip performance was not
a primary design goal. Circuit densities reported were
80-100 transistors per square millimeter in 5 micron feature
size NMOS, approximately 2 orders of magnitude lower than
the state of the art layouts reported in Gross [Ref. 8].
Southard contends that the cells he helped design for
MacPitts could fairly easily have been made 20 per cent
smaller than they are [Ref. 5].

MacPitts only produces NMOS output in CIF, but
the user has a choice of either 4 or 5 micron minimum
feature size, which the compiler handles by linearly scaling
all features except the pads. The latter are contained in
two separate libraries for 4 micron and 5 micron designs.
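
Linear scaling of this kind amounts to multiplying every
layout coordinate by a constant. The Python sketch below
illustrates the idea under stated assumptions: the cell
records, the field names kind and box, and the choice to copy
pads through unchanged (standing in for selection from a
separate pad library) are all hypothetical and do not
describe the MacPitts data structures.

```python
from fractions import Fraction

def scale_layout(cells, factor):
    """cells: list of dicts with 'kind' and 'box' = (x0, y0, x1, y1).

    Every box is multiplied by `factor`, except pads, which are
    assumed to be supplied at the target size by a separate library.
    """
    scaled = []
    for cell in cells:
        if cell["kind"] == "pad":
            scaled.append(dict(cell))          # pads are not scaled
        else:
            box = tuple(c * factor for c in cell["box"])
            scaled.append({**cell, "box": box})
    return scaled
```

Going from a 5 micron to a 4 micron design corresponds to
factor = Fraction(4, 5); an exact fraction is used here so
that the scaled coordinates carry no rounding error.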

From the programming viewpoint, MacPitts is a
very complex system. It consists of a binary executable
module of over 1.5 megabytes which was built up as a LISP
programming environment and then dumped, as described in the
Franz Lisp manual [Ref. 13]. A synopsis of the functional
elements which make up this LISP environment is shown in
figure 2.5. Unlike F.I.R.S.T., these programs (except the
functional simulator or "interpreter" as its authors call
it) are not individually accessible. MacPitts runs
automatically from beginning to end with no possibility for
operator intervention. The only control available at the
console when the compiler is running is the standard UNIX
system abort signal.

Figure 2.5 MacPitts Program Data Flow.

The authors of MacPitts were careful to separate
all the processing into technology independent (front-end)
and technology dependent (back-end) portions, with the
intermediate-level description being the point of division.
This intermediate-level description is available to the user
as an "object file" in human readable form. It is possible,
although not very practical, to write an object file
directly for input to the back end of MacPitts. The object
file is a long list containing 5 elements, each element
being itself a list. The 5 elements are: definitions,
flags, data path, control, and pins. This list is, of
course, in a form readable by the layout programs.
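
The five-part structure of the object file can be pictured
as a list of five sub-lists. The Python sketch below shows
only that outer shape; the section keys, the function name
split_object_file and any contents placed in the sub-lists
are hypothetical, since the internal format of each element
is not described here.

```python
# Section names follow the order given in the text.
SECTIONS = ("definitions", "flags", "data path", "control", "pins")

def split_object_file(obj):
    """obj: a list of exactly five sub-lists, in the order above.
    Returns a dict keyed by section name."""
    if len(obj) != len(SECTIONS):
        raise ValueError("expected 5 elements, got %d" % len(obj))
    if not all(isinstance(part, list) for part in obj):
        raise ValueError("every element must itself be a list")
    return dict(zip(SECTIONS, obj))
```

Keeping the check on the element count separate from any
interpretation of the contents mirrors the division in the
text: the back end needs all five parts, whatever they hold.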

The layout programs produce only NMOS
technology. As mentioned above, two bonding pad libraries
are included: the Stanford standard cell library pads for 5
micron designs, and the MOSIS ARPA community pads for 4
micron designs. The "layout language" and CIF generation
program, L5, which is embedded in MacPitts, was written
especially for the project by Crouch [Ref. 14]. It has
built-in facilities to handle both NMOS and CMOS technology
layouts. Therefore, expanding MacPitts to produce CMOS CIF
would not entail a complete rewrite of the back end
programs.

An important feature of the MacPitts software is
the functional simulator or interpreter. A MacPitts program
is not only an IC specification, it is also an algorithmic
specification. The interpreter executes the specification
program as a general-purpose computer using an interactive,
screen-oriented input/output style. By invoking this option
of MacPitts the user can exercise his design, thereby
validating (to whatever extent the exercise is complete) its
functional fidelity. Once the functional simulation is done
to satisfaction, MacPitts can be restarted without setting
the interpreter option. This produces a finished layout and
corresponding CIF file. By using the same language to drive
both the interpreter and the integrated circuit compiler,
human error is reduced.

MacPitts lacks some features. It has none of
the capabilities of F.I.R.S.T. to produce a test pattern to
exercise the chip. It also lacks any built-in mechanism to
identify worst-case path delays or to predict the maximum
clock frequency of the finished chip. It does keep account
of conductivity information, however, which it uses to
predict chip power consumption.

MacPitts uses a "correct by construction"
doctrine in the layout process. By denying the user the
means to specify the layout details of the chip, this
approach also denies the user the opportunity to commit
design rule errors or to translate the specification program
into a non-corresponding layout. But can MacPitts itself
make design rule errors?

The following chapters examine how to use
MacPitts to produce an integrated circuit layout, how to
validate the design, and where to look for ways to improve
chip performance.


III. USING MACPITTS

A. THE INPUT FILE

i.  maai aaAii  s£  iM  Maggots  laaaaaaa 

"MacPitts," the system for generating a custom
integrated circuit, is also "MacPitts," the language in
which the algorithm is specified. In this section the
second meaning is the one implied. All of the information
which specifies what functional behavior is required of a
VLSI circuit is communicated to MacPitts in a single text
file. This file, which must have the extension ".mac", is
written using syntax which closely resembles that of the
LISP programming language. Because the MacPitts compiler is
implemented in LISP, it is reasonable to expect the syntax
of the MacPitts design language to follow the LISP
parenthesized notation. This choice was made by the authors
of MacPitts because it eliminates the need for a separate
parser.

LISP is a list processing language. Its data
elements are "symbolic expressions" made up of "atoms"
(fundamental word-like objects separated by spaces), lists
of atoms, lists of lists of atoms and so on. One of the
strengths of LISP is the ability to concatenate atoms or
lists into new lists, and to perform other operations on a
list or a hierarchy of lists to produce new lists modified
in useful ways. LISP has many built-in functional
definitions which are an "environment" of specifications for
the operations to be performed on lists. These definitions
are all contained in The Franz Lisp Manual [Ref. 13]. In
addition to using these definitions, the LISP user is free
to extend the LISP environment by defining new functions
which specify other operations on lists. The types of
operations may be simple manipulations of the atoms by
partitioning or permutation, or, if the atoms which comprise
the list happen to be numbers, arithmetic operations may be
performed. The definitions of the operations themselves may
also be assembled from lists of more primitive operational
atoms. This functional extension of operations is what the
authors of MacPitts have done in creating the MacPitts Lisp
environment.
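
The nesting of atoms and lists described above can be made
concrete with a minimal reader for parenthesized text. The
Python sketch below is an illustration only: it is neither
Franz Lisp nor the MacPitts front end, the function names are
invented, and it assumes balanced parentheses and atoms that
contain no spaces.

```python
def tokenize(text):
    """Split text into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens):
    """Consume tokens from the front; return one atom or nested list."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(read(tokens))
        tokens.pop(0)             # discard the closing ")"
        return items
    try:
        return int(token)         # numeric atom
    except ValueError:
        return token              # symbolic atom

def parse(text):
    return read(tokenize(text))
```

Once text has been read into nested Python lists, joining
two lists into a new one is ordinary list concatenation,
which is the kind of operation the paragraph above has in
mind.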

The design of a VLSI circuit can be thought of as a
list-building process in which the lists are electrical
ports, registers, interconnection nets, data testing
operations, and ultimately a string of words which define a
unique patterning of silicon in the mask level descriptive
language, CIF. These lists are built according to rules
contained in another list, the algorithmic specification
source file. Although the MacPitts design language
resembles LISP syntactically, its semantics are different
and much more limited. A powerful feature of LISP is, for
example, recursive definition. This feature is absent in
the MacPitts design language. A description of the MacPitts
grammar in Backus normal form is given in [Ref. 4].

In its most general form, a MacPitts "program" to
specify a circuit's behavior consists of a set of
"processes," each of which executes sequentially, but all of
which run in parallel. The states of each process are
fundamentally disjoint from those of the other processes.
This allows the hardware for each process to run
independently of the other processes, if desired, and concurrently
with the states of the other processes, in any case. The
operations performed by a given process in a given state are
specified by a "form." Each form corresponds to a single
machine state, and is executed in one clock cycle. A state
may be given a name by preceding the form with a label.
Normally execution proceeds sequentially from one state to
the following state in the .mac file at each clock cycle. A
"go" form can be used, however, to deviate from this
sequential flow by causing the named state to be executed next
instead of the syntactically following state.
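
The sequencing rule just described can be pictured with a short
Python sketch. It is not MacPitts code and does not model the
compiler. The function name run_process, the representation of a
state as a (label, action) pair, and the decision to stop when
execution runs past the last state are all assumptions made for
illustration.

```python
def run_process(states, cycles):
    """Step a list of (label, action) states for a number of clock cycles.

    Each action models one form. It returns the label named by a "go",
    or None to fall through to the syntactically following state.
    """
    index_of = {label: i for i, (label, _) in enumerate(states) if label}
    position = 0
    visited = []
    for _ in range(cycles):
        if position >= len(states):
            break  # assumption: the sketch simply stops past the last state
        label, action = states[position]
        visited.append(position)          # one state per clock cycle
        target = action()
        position = index_of[target] if target is not None else position + 1
    return visited


# Three states; the last one uses a "go" back to the state labeled "top".
example = [
    ("top", lambda: None),
    (None, lambda: None),
    (None, lambda: "top"),
]
trace = run_process(example, 5)
```

The list returned in trace records which state was active in each
simulated clock cycle, so the effect of the "go" is visible as a
jump back to the labeled state.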

Data is communicated between the data path and the
external world through "ports" which have the same bit width
as the data path. Only a single data path width definition
is allowed per program. A port may be declared "input,"
"output," "tri-state output," or "i/o." Ports may also be
declared as "internal," in which case they simply cascade
the output of one data path operation to the input of
another. The data path may also be specified to contain
registers. The difference between internal ports and
registers is that registers can store data indefinitely after it
has been clocked in, whereas ports are only electrical nodes
in the data path and therefore do not store data. Ports
simply are arrays of named terminals for conducting data
from one point to another.

Control of operations performed on the data by the
data path organelles is governed by the Weinberger array
control unit. Control outputs from the control unit to the
data path may determine, by means of their control over
multiplexer organelles within the data path, which
operations occurring within the data path will affect downstream
organelles. Status outputs from the data path returning to
the control unit allow the sequence of operations performed
by the control unit to vary depending on the data present
either in the registers or at any other point in the data
path. The control unit functions may also be made to depend
upon external inputs. The control unit communicates with
the outside world using "signals," which are analogous to
the "ports" used by the data path except that each signal
appears on a single wire. Signals may be declared as
"input," "output," "tri-state output," "i/o" or "internal."


Operations performed by the data path during a given
state are specified by the LISP "setq" form. The setq
causes the data path to evaluate a sequence of operations on
either input port data, internal port data or register data.
(The setq may also be used with signals.) The result of
these specified operations is then conducted to another
named port or loaded into a data path register during the
next clock cycle. The compiler includes enough copies of
each operator in the data path so that separate processes,
intended to run in parallel, do not conflict over the
attempted shared use of a single resource. The data path
can cascade several operations together in a single form.
This allows forms such as the following example, which
computes a=b-c using 2's complement arithmetic, to execute
in one clock cycle:

(setq a (+ b (1+ (not c)))) .

The list consisting of everything on the preceding line is a
single form. There are three operators in this expression:
"+" which specifies use of an adder, "1+" which specifies
an incrementer, and "not" which specifies an inverter. Each
operator is followed by its operands listed in symbolic
notation. Therefore, the single operand of 1+ is the
integer that results from evaluating the expression "(not
c)." Note that there is not a default hierarchy of operations
within a form. As with LISP, the order of operations
in MacPitts must be specified explicitly by the use of
nested parentheses.
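
The 2's complement identity behind this form can be checked
numerically. The following Python sketch is not MacPitts code; the
function name sub_via_adder and the 4 bit default width are
assumptions chosen for illustration. It models the inverter, the
incrementer and the adder as operations on a fixed-width word and
cascades them in the same order as the nested parentheses.

```python
def sub_via_adder(b, c, width=4):
    """Model (+ b (1+ (not c))) on a data path of the given bit width."""
    mask = (1 << width) - 1
    inverted = ~c & mask               # "not": bitwise inverter
    negated = (inverted + 1) & mask    # "1+": incrementer, giving -c
    return (b + negated) & mask        # "+": adder, carry out discarded


if __name__ == "__main__":
    print(sub_via_adder(9, 3))
```

For every pair of operands the result agrees with b-c reduced
modulo 2 to the power of the word width, which is what a data path
of that width can represent.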

Sequences of setq forms normally operate
sequentially, each being executed on a separate clock cycle. By
enclosing the forms within another "parallelizing form," of
which "par" is an example, several forms can be made to run
in parallel, gaining speed over sequential operation at the
cost of more hardware and hence more area in silicon. The
par form is used as follows:

(par form1 form2 form3 ...)

Of course the results obtained by running setq forms in
parallel may be quite different from those obtained by
running them all sequentially within one process. Consider
the following example where "a" and "b" have already been
declared registers (i.e. master-slave flip flops):

(par (setq a b)
     (setq b a)) .

This expression will result in exchanging the contents of
"a" with contents of "b." The exchange will be done in one
MacPitts clock cycle. This action is made possible by the
input isolation which occurs during the flip-flop operating
cycle. All such data storage elements are read before they
are written. On the other hand, sequential operation of the
same setq's is implied in the following process:

(process load1 (setq a b)
               (setq b a)) .

This process will load both b and a with the original
contents of b, and require two cycles to do it. (Here
"load1" merely furnishes a process name, as demanded by the
MacPitts grammar.) We have used two lines and indented
format only for the sake of clarity. All the functional
information needed by MacPitts is denoted by the ordering of
forms within the nests of parentheses.
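
The difference between the two orderings can be reproduced with a
small Python model. This is a sketch under stated assumptions, not
a description of the compiler: the names run_par and
run_sequential are hypothetical, registers are held in a
dictionary, and each assignment is a (destination, source) pair.

```python
def run_par(regs, assignments):
    """Parallel form: all sources are read before any register is written."""
    new_values = {dst: regs[src] for dst, src in assignments}
    updated = dict(regs)
    updated.update(new_values)
    return updated, 1                  # everything happens in one clock cycle


def run_sequential(regs, assignments):
    """Sequential states: each assignment sees the writes made before it."""
    updated = dict(regs)
    cycles = 0
    for dst, src in assignments:
        updated[dst] = updated[src]
        cycles += 1                    # one clock cycle per state
    return updated, cycles


swap = [("a", "b"), ("b", "a")]        # (setq a b) (setq b a)
start = {"a": 1, "b": 2}
par_result = run_par(start, swap)
seq_result = run_sequential(start, swap)
```

Here par_result holds the exchanged values after one cycle, while
seq_result shows both registers holding the original contents of b
after two cycles.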

The "cond" form allows the conditional execution of
other forms it contains during a given state. It consists
of a list of guards, only one of which is to be executed.
Each guard begins with a "condition" which determines
whether the remaining forms in the guard are to be executed.
The first guard whose condition is true enables the
execution of the forms following the condition in that guard.
This is illustrated by the following example adapted from
[Ref. *].


(cond (condition1 (cond (condition2 form1 form2)
                        (condition3 form3 form4 form5)
                        (t form6)))
      (condition4 (cond (condition5 form7 form8))
                  (cond (condition6 form9))
                  form10))

This example is heavily nested. Nevertheless, close
examination reveals that the outermost "(cond..." has only two
guards in its list, each of which contains other "(cond..."
forms. The two guards are:

(condition1 (cond (condition2 form1 form2)
                  (condition3 form3 form4 form5)
                  (t form6)))

and

(condition4 (cond (condition5 form7 form8))
            (cond (condition6 form9))
            form10) .

If condition1 is false and condition4 is true then form10 is
executed. If condition5 is true then form7 and form8 are
executed along with form10. Likewise if condition6 is true
then form9 is executed in parallel as well.

The semantics of the cond statement is inherently
parallel. The conditions of the alternate guards are
checked in parallel. Likewise, all forms within the guards
are executed simultaneously in one clock cycle. The
compiler makes the conditions of different guards in one
cond form mutually exclusive, and implements them using
combinational logic in the control unit as described above.
This logic is used to enable or inhibit the execution of
forms controlled by that guard in parallel.
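
The guard selection rule can be made concrete with a short Python
sketch. It is an illustrative model only: the function names
eval_cond and example are hypothetical, a cond is represented as a
list of (condition, forms) pairs, a nested cond is represented as
a nested list, and the result is simply the list of forms enabled
during the state. In the hardware all of the enabled forms execute
together in one clock cycle; the list order has no timing meaning.

```python
def eval_cond(guards):
    """Return the forms enabled by the first guard whose condition is true."""
    enabled = []
    for condition, forms in guards:
        if condition:
            for form in forms:
                if isinstance(form, list):      # a nested cond form
                    enabled.extend(eval_cond(form))
                else:
                    enabled.append(form)
            break                               # later guards are inhibited
    return enabled


def example(c1, c2, c3, c4, c5, c6):
    """The nested cond example, with the six conditions as booleans."""
    return eval_cond([
        (c1, [[(c2, ["form1", "form2"]),
               (c3, ["form3", "form4", "form5"]),
               (True, ["form6"])]]),
        (c4, [[(c5, ["form7", "form8"])],
              [(c6, ["form9"])],
              "form10"]),
    ])
```

Calling example with condition1 false and conditions 4, 5 and 6
true yields form7, form8, form9 and form10, in agreement with the
walk-through of the two guards.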

Note that the form:

(cond (t form1 form2 form3 ...))

is used to enable parallel execution of several forms during
one clock cycle without being dependent on any condition.
(The "t" stands for "true.") The "(par..." form already
encountered is actually just a shorthand macro expression
for the "(cond (t..." form.

In a MacPitts layout, the conditions are formed in
the control unit, which is a Weinberger array of NOR gates
[Ref. 15]. Therefore, they are not limited to only the
sum-of-products notation used by PLA-based finite state
machine compilers. The conditions are derived from either
signals arriving on an input pin, signals from the data
path, or signals arriving from other processes. More
complex conditions can be constructed from these signals
using the logical operators "and," "or" and "not" to build
arbitrary Boolean expressions. These operators are part of
the MacPitts library of functions. Thus, the cond statement
is one of the most powerful features for providing high
performance designs.

With this brief and somewhat condensed description
of the features available in the MacPitts algorithmic
language, the way is prepared to understand an example of
some code which will produce a complete integrated circuit
chip. A full detailed description of all the facilities of
MacPitts is found in a report authored by its creators
[Ref. 16], which also serves as a fairly complete users'
manual.

2. Two Multiplier Examples

Consider, line by line, figure 3.1 which is a
listing of the file multic.mac. This example and the one
which follows it are inspired by similar ones in [Ref. 16].
It contains all of the design information needed by MacPitts
to produce a 4 bit combinational multiplier. On any line,


 1  ; multiplier, no state combinational
 2  (program multic 4
 3    (def 1 ground)
 4    (def ain port input (2 3 4 5))
 5    (def bin port input (6 7 8 9))
 6    (def res port output (10 11 12 13)) ; result
 7    (def r0 port internal)
 8    (def r1 port internal)
 9    (def r2 port internal)
10    (def 14 phia)
11    (def 15 phib)
12    (def 16 phic)
13    (def 17 power)
14    (always
15      (cond ((bit 0 bin) (setq r0 (>> (bit 0 ain) ain)))
16            (t (setq r0 0)))
17      (cond ((bit 1 bin) (setq r1 (>> (bit 0 (+ r0 ain)) (+ r0 ain))))
18            (t (setq r1 (>> (bit 0 r0) r0))))
19      (cond ((bit 2 bin) (setq r2 (>> (bit 0 (+ r1 ain)) (+ r1 ain))))
20            (t (setq r2 (>> (bit 0 r1) r1))))
21      (cond ((bit 3 bin) (setq res (>> (bit 0 (+ r2 ain)) (+ r2 ain))))
22            (t (setq res (>> (bit 0 r2) r2))))))

Figure 3.1  Multic.mac Source File.


text following a semicolon is treated as a comment, which
the compiler ignores. Line 2 tells the compiler that a
"program" (which is another way of saying, "circuit design")
called "multic" starts here, and that the data path is 4
bits wide. Because the data path is only 4 bits, this
simple multiplier will only be able to output numbers from 0
to 15. Even though the input ports are also four bits wide,
we must restrict input numbers to only those whose product
falls in the range of values from 0 to 15. Furthermore, if
this algorithm is to give correct results for all
multipliers, without overflow, the leading bit of the
multiplicand must be zero. No provision is made to output a flag if
the dynamic range of the multiplier is exceeded.

Lines 3 through 13 declare the various signals and
integer data words input to, output from and existing within
the multiplier. Line 3 assigns the ground connection to pin
1 which is always in the upper left corner of the layout;
subsequent pin numbers proceed clockwise from this point
around the layout perimeter. Line 4 assigns pins 2-5 to an
input port labeled "ain." This input is the multiplicand.
By MacPitts convention, the most significant bit (MSB) of
ain is read from the first pin on the list, pin 2, and the
least significant bit (LSB) from the last pin on the list,
pin 5. Line 5 similarly defines the multiplier input port,
"bin." Line 6 assigns an output port labeled "res" (for
result) to another block of 4 pins. This port also serves
as the accumulator for the fourth and final partial product.
Lines 7 through 9 define 3 internal ports (necessarily of
width 4 bits) labeled r0, r1 and r2. These serve to cascade
the three stages of a standard shift and add algorithm.
Each port contains one of the first three partial products,
each being the result of operations conditioned on one of
the multiplier bits. Lines 10 through 12 assign pins to the
three phase clock, whether that clock is used by the circuit
or not. In multic.mac the clock is not used. Line 13
defines the +5 volt direct current power, Vdd, connected to
pin 17.

Line 14 signifies that the functions which follow,
up to the matching right parenthesis on line 22, are to
execute on every clock cycle. The "(always..." form is
really the "(process..." form, reduced to a single state.
Moreover in this case, given the (always... form, and given
that the data path contains only ports and not registers,
the inputs will affect the result after an interval governed
only by the sum of the physical gate delays in the data path
and control unit. There is no controlled latency in the data
path, because there are no registers in this design in which
to store data.

Lines 15 through 22 contain the shift and add
scheme. In lines 15 and 16 the controller is told to
examine bit 0 (the LSB) of bin. If it is high (true) the r0
port takes on the value of the ain port rotated right by one
bit, i.e. r0 is actually connected by means of a multiplexer
organelle to a right rotated version of ain. The
shift-right-one-bit form, ">>," takes two arguments. The second
argument specifies what data word is being shifted, and the
first tells what to put in the MSB of that data word. Thus,
a rotate is also within the capabilities of the shift form,
as it is applied in this case. If bit 0 of bin is not high,
then, by line 16, the r0 port (all 4 bits) is connected to
ground. In lines 17 and 18 the controller is told to
examine bit 1 of bin. If it is high, then r1, the next
internal port in the data path, is connected to a
right-rotated version of the sum of r0 and ain. The adder
organelle in MacPitts performs this summation as a standard
ripple carry full addition. Note again that the expression:

(bit 0 (+ r0 ain))

in line 17 turns the single shift operator into a right
rotate operator by making the MSB of r1 contain the same
value as bit 0 of the sum of r0 and ain. If bit 1 of bin is
low, on the other hand, line 18 instructs the controller to
connect r1 to simply a right-rotated version of r0. Note
that no rotations are being performed by any of these
operations in the sense that a shift register would perform them.
It is only the interconnections between organelles that are
being set up variously by the controller to give an
appearance of forwarding a rotated version down the data path.
Also note that even though the addition form appears twice
in line 17, logically only one adder need be instantiated,
since the operands are identical in both occurrences.
MacPitts, too, can recognize this, and will not waste space
creating more adders than the minimum necessary. In lines
19 and 20 the controller examines bit 2 of bin. If it is
high, port r2 is connected to a right-rotated version of the
sum of r1 and ain. If bit 2 of bin is low, r2 is connected
to a right-rotated version of r1. In lines 21 and 22 the
controller finally examines the MSB, bit 3, of bin. If it
is high, the output port, res, is connected to a
right-rotated version of the sum of r2 and ain. If bit 3 of bin
is low, res is connected to a right rotated version of r2.
For concreteness, a schematic trace of this algorithm in
action on the problem "4x3=12" is presented in figure 3.2.


ain=4   0100
bin=3   0011

Algorithm Statement                                Result

(setq r0 (>> (bit 0 ain) ain))                     r0=2    0010
(setq r1 (>> (bit 0 (+ r0 ain)) (+ r0 ain)))       r1=3    0011
(setq r2 (>> (bit 0 r1) r1))                       r2=9    1001
(setq res (>> (bit 0 r2) r2))                      res=12  1100

Figure 3.2  Example of the Multic Behavioral Specification.
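
The behavior specified in figure 3.1 can also be reproduced in
software. The Python sketch below is an illustrative model, not
output of MacPitts and not a substitute for its interpreter. The
helper names bit, shift_right, stage and multic_trace are
hypothetical, and the adder is assumed to discard its carry out,
as a 4 bit data path would.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1


def bit(n, word):
    """Model of (bit n word): extract bit n, where bit 0 is the LSB."""
    return (word >> n) & 1


def shift_right(msb, word):
    """Model of (>> msb word): shift right one bit, placing msb at the top."""
    return ((msb << (WIDTH - 1)) | (word >> 1)) & MASK


def stage(select, partial, ain):
    """One shift and add stage; select is the multiplier bit for the stage."""
    total = (partial + ain) & MASK if select else partial
    return shift_right(bit(0, total), total)   # rotate right by one bit


def multic_trace(ain, bin_):
    """Return (r0, r1, r2, res) as wired by lines 15-22 of figure 3.1."""
    r0 = shift_right(bit(0, ain), ain) if bit(0, bin_) else 0
    r1 = stage(bit(1, bin_), r0, ain)
    r2 = stage(bit(2, bin_), r1, ain)
    res = stage(bit(3, bin_), r2, ain)
    return r0, r1, r2, res


def multic(ain, bin_):
    return multic_trace(ain, bin_)[3]
```

Evaluating multic_trace(4, 3) gives the same intermediate values
as figure 3.2. Because each of the four stages rotates by one bit,
the partial products return to their original alignment at the
output port.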


For comparison, consider now another design. This
one is specified by the file multip.mac shown in figure 3.3.
This is a four bit pipelined multiplier in which the product
does not appear at the result port until the third clock
cycle after values have been applied to the inputs, ain and
bin. Changing the combinational design to a pipelined
design can most easily be accomplished in two steps. First,


 1  ; multiplier, with pipelining
 2  (program multip 4
 3    (def 1 ground)
 4    (def ain port input (2 3 4 5))
 5    (def a0 register)
 6    (def a1 register)
 7    (def a2 register)
 8    (def bin port input (6 7 8 9))
 9    (def b0 register)
10    (def b1 register)
11    (def b2 register)
12    (def res port output (10 11 12 13))
13    (def r0 register)
14    (def r1 register)
15    (def r2 register)
16    (def 14 phia)
17    (def 15 phib)
18    (def 16 phic)
19    (def reset signal input 17)
20    (def 18 power)
21    (always
22      (cond ((bit 0 bin) (setq r0 (>> (bit 0 ain) ain)))
23            (t (setq r0 0)))
24      (cond ((bit 1 b0) (setq r1 (>> (bit 0 (+ r0 a0)) (+ r0 a0))))
25            (t (setq r1 (>> (bit 0 r0) r0))))
26      (cond ((bit 2 b1) (setq r2 (>> (bit 0 (+ r1 a1)) (+ r1 a1))))
27            (t (setq r2 (>> (bit 0 r1) r1))))
28      (cond ((bit 3 b2) (setq res (>> (bit 0 (+ r2 a2)) (+ r2 a2))))
29            (t (setq res (>> (bit 0 r2) r2))))
30      (cond (reset (setq a0 0)
31                   (setq b0 0)
32                   (setq a1 0)
33                   (setq b1 0)
34                   (setq a2 0)
35                   (setq b2 0))
36            (t (setq a0 ain)
37               (setq b0 bin)
38               (setq a1 a0)
39               (setq b1 b0)
40               (setq a2 a1)
41               (setq b2 b1)))))

Figure 3.3  Multip.mac Source File.

the three internal ports of multic, r0, r1 and r2, are all
redefined as registers. Then six other new registers, a0-a2
and b0-b2, are defined to send successive values of the
inputs ain and bin down the pipe in step with their
corresponding partial products. The ease with which this is done
(from a user's point of view) is evidence of the power of
MacPitts to create custom designs.




Referring to figure 3.3 we see that the shift and
add algorithm, lines 22-29, is identical in structure to that of
multic.mac, except that the later stages take their operands
from the pipeline registers rather than directly from ain and
bin. In line 19 pin 17 is defined as a "reset" signal
input. The reset signal is required for any MacPitts design
which uses one or more "process" forms in order that the
program counters for all processes can always be reset to
the same known state. This is obviously vital when two or
more processes on the same chip must be synchronized. In
the multip design, however, which uses the "(always..."
form, the reset signal performs no such built in automatic
function. The reset signal is available, however, for
user-specified functions as well, and in this case is used only
to signal a setq of all internal multiplier and multiplicand
registers to zero, instead of passing the values one more
step down the pipeline. Therefore, the reset is not
essential to the pipeline multiplier operation here but only acts
to allow the pipeline to be emptied out and to inhibit any
new input data from propagating to completion, for what that
may be worth in whatever the intended application. It is
included here for illustration only. Recall that
propagation of all input data in the pipeline (lines 30-35 or, if
reset is false, lines 36-41) occurs in a single clock cycle
as well, because these setq's are enclosed in the "(cond..."
form, which causes them to be executed in parallel.
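
The three cycle latency of the pipelined design can be observed
in a small Python model of figure 3.3. This is a sketch under
stated assumptions rather than a reproduction of the MacPitts
interpreter: the class name Multip and its methods are
hypothetical, all registers are assumed to start at zero (in the
real circuit they start undefined), and the adder is assumed to
discard its carry out.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1


def bit(n, word):
    return (word >> n) & 1


def rotate_stage(select, partial, addend):
    """Add conditionally, then rotate right one bit, as in lines 22-29."""
    total = (partial + addend) & MASK if select else partial
    return ((bit(0, total) << (WIDTH - 1)) | (total >> 1)) & MASK


class Multip:
    def __init__(self):
        # Assumption: registers start at 0; real hardware starts undefined.
        names = ["a0", "a1", "a2", "b0", "b1", "b2", "r0", "r1", "r2"]
        self.reg = dict.fromkeys(names, 0)

    def res(self):
        """Output port: a combinational function of the last register stage."""
        r = self.reg
        return rotate_stage(bit(3, r["b2"]), r["r2"], r["a2"])

    def clock(self, ain, bin_, reset=False):
        old = self.reg                      # every register is read first
        new = {
            "r0": rotate_stage(bit(0, bin_), 0, ain),
            "r1": rotate_stage(bit(1, old["b0"]), old["r0"], old["a0"]),
            "r2": rotate_stage(bit(2, old["b1"]), old["r1"], old["a1"]),
        }
        if reset:                           # lines 30-35
            new.update(dict.fromkeys(["a0", "b0", "a1", "b1", "a2", "b2"], 0))
        else:                               # lines 36-41
            new.update(a0=ain, b0=bin_, a1=old["a0"], b1=old["b0"],
                       a2=old["a1"], b2=old["b1"])
        self.reg = new                      # then all are written together


def run(pairs):
    """Clock one (ain, bin) pair per cycle; return res after each cycle."""
    chip = Multip()
    outputs = []
    for ain, bin_ in pairs:
        chip.clock(ain, bin_)
        outputs.append(chip.res())
    return outputs
```

With a new operand pair applied on every cycle, each product
appears at the output after the third clock following its entry,
which is the throughput advantage the pipeline is meant to
provide.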

B. INVOCATION OPTIONS

Equipped with one or more .mac files written to reflect
the desired behavior of a circuit, the user is ready to run
macpitts, the name assigned to the executable binary file on
the UNIX operating system which embodies the MacPitts system.
The form of the command line invocation from the UNIX shell
is simply

% macpitts <program_name> <options>

where <program_name> would be either multic or multip, in
the case of the previous examples, and <options> is any or
none of the words from the list:


stat*      nostat*
herald     noherald*
cif*       nocif
obj*       noobj
int        noint*
opt-d*     noopt-d
opt-c*     noopt-c
4u         5u*

where the * options are the defaults and the left and right
columns are mutually exclusive.

The "stat" option tells macpitts to output statistics about
the chip design to the standard output device (terminal
screen, normally) as various parameters are calculated.
Figure 3.4 shows the statistics generated for the multip


 1  Statistic  for project multip
 2  Statistic  options: (5u herald opt-d opt-c stat obj cif)
 3  Statistic  Maximum control depth is 4
 4  Statistic  Number of gates is 60
 5  Statistic  Data-path has 25 Units
 6  Statistic  Control has 69 columns
 7  Statistic  Circuit has 1129 transistors
 8  Statistic  Control has 17 tracks
 9  Statistic  Power consumption is 0.172120 Watts
10  Statistic  Data-path internal bus uses 5 tracks
11  Statistic  Dimensions are 6.320000 mm by 2.847500 mm
12  Statistic  Memory used - 526k
13  Statistic  Compilation took 30.432777 CPU minutes
14  Statistic  Garbage collection took 18.520277 CPU minutes
15  Statistic  For a total of 796 garbage collections

Figure 3.4  Compiler Statistics for multip.


chip. The meaning of these statistics is as follows.

Line 1 simply echoes the program name which was given at the
beginning of the multip.mac source file.

Line 2 summarizes the invocation options in effect either by
user selection or default.

Line 3 gives the worst-case number of logic levels between
any input and any output in the control unit.

Line 4 gives the total number of NOR gates needed in the
control unit.

Line 5 is the number of data path "organelle units," where
an organelle unit is a word-length assembly of organelle
bits. This number is the same as the number of elements in
the data path list of the multip.obj file.

Line 6 is the number of vertical metal columns in the
control array, excluding the ground columns.

Line 7 is the total number of transistors in the circuit,
including the data path, control unit, and all bonding pads.

Line 8 is the stack height of horizontally running
polysilicon lines used to intraconnect the control unit.

Line 9 is an estimate of the worst-case static power
consumption of the chip obtained using the layout topology,
heuristic values of undetermined origin for the conductivity
of each electrical feature, and a 5 volt power supply.

Line 10 is the maximum stack height of horizontally placed
polysilicon lines, per bit in the data path, needed to
interconnect the organelles.

Line 11 is the overall outline size of the chip layout.

Line 12 is the peak storage allocation demanded by macpitts
during the run.

Line 13 is the CPU time required for compilation and layout,
which is always less than the apparent running time by an
amount which depends on the average system usage rate.

Lines 14 and 15 reflect a function of Franz Lisp wherein
past used storage locations are reclaimed for the available
memory list. The last three statistics were probably
included because macpitts can be very demanding of computing
resources.


The "herald" option outputs messages to the terminal
screen at each milestone in the sometimes lengthy
compilation process. These reassure the user that macpitts is
still running. In addition to heralding what point in the
design process macpitts is currently working on, information
on current accumulated CPU time and CPU garbage collection
time is printed at the beginning of each herald line in
units of sixtieths of a second.

The "cif" option keys the compiler to output a mask
level description .cif file in the Caltech Intermediate
Form. The cif option is normally not deselected unless the
available disk storage space is limited and the user is only
interested in reading the statistics for his compiled
design. (The cif file for a relatively simple design,
multip.cif, is over 158 kilobytes long.) If no cif is
produced on a given macpitts run, the entire layout process
must be repeated to subsequently obtain a cif file. This is
done most expeditiously by running macpitts with the noobj
option.

The "noobj" option tells macpitts to start with a
previously created object file (the output of the macpitts "front
end") rather than a source file. MacPitts will then
effectively start at the "back end," doing the layout and
outputting statistics and cif, assuming these are included in
the options list.

"Int" tells macpitts to use the interpreter mode, which
allows functional simulation of the chip without actually
performing the layout and generating a .cif file.

"Opt-c" and "opt-d" invoke optimization routines for
normalization of the combinatorial logic of the control
unit. Investigation of the four possible combinations of
these two options reveals that they do not affect the
overall dimensions of the final 8 bit multiplier design (to
be described later). This is probably because the pins,
data path layout and bus wiring dominate the chip area, not
the control unit, which is comparatively small for this
chip. The compilation time required, however, was
approximately 20 percent greater when opt-c and opt-d were used
than when they were not used. Using opt-c and opt-d does
reduce the complexity of the control unit, and therefore
will reduce signal delays, to the benefit of operating
speed.

The "4u" option sets the minimum feature size for the
layout to 4 microns, and accordingly lambda, the commonly
used parameter which represents the half line width
dimension, is set to 200 centimicrons.
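
The relation between the feature size option and lambda is simple
arithmetic, shown here as a brief Python sketch. The function name
lambda_centimicrons is hypothetical. The 4 micron figure comes
from the text above; applying the same half line width rule to the
5u option is an assumption of this sketch, not a value quoted from
the MacPitts documentation.

```python
def lambda_centimicrons(feature_size_microns):
    """Half the minimum feature size, in hundredths of a micron."""
    return feature_size_microns * 100 // 2


for option, microns in (("4u", 4), ("5u", 5)):
    print(option, lambda_centimicrons(microns))
```

Expressing lambda in centimicrons keeps the value an integer for
both feature sizes.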

Another option, logo, was available in the original
macpitts, but is not supported at NPS because suitable font
files are not currently available.


C. USE OF THE MACPITTS INTERPRETER

Invoking macpitts with the int option should be the
first step in every MacPitts design cycle. MacPitts has
good facilities for catching grammatical errors in the
user's .mac source code which operate whether or not the
interpreter is invoked. After the .mac file passes grammar
checks, the interpreter allows the extracted algorithmic
description to be exercised with arbitrary inputs. The
results are displayed on the screen to provide an indication
that the design is functionally correct. Assuming the
user's path list is set up in the .login file to include the
directory, /vlsi/macpit, the following command can be
issued:

% macpitts multip int herald

This will cause macpitts to scan the multip.mac source file
and extract from it the circuit behavior information. Then
macpitts will display a table of all declared ports,
registers, flags, signals and processes, noting that they
are all currently undefined. The user may select for
display, at this point, a menu of interactive commands which
clearly states how to interact with the interpreter. The
user can set the values of input ports and signals as
desired. Not all internal ports will necessarily be defined
simply by setting the input ports. Generally several clock
cycles must be simulated before the chip internals are all
defined. MacPitts tells the user which antecedents stand in
the way of resolving data definitions. Next the user will
probably single step (or multi step) the macpitts clock
while observing the effect on the internal registers and
output port(s) after each cycle. There is also provision to
write out the current state of the circuit to a file,
multip.int. Any number of states can be saved by
appropriate renaming of files as they are written. Since
macpitts does not allow the user to specify different file
names for each state saved, newly written .int files can
immediately be renamed uniquely from an adjacent terminal
logged on to the same account as the one running macpitts.
This is completely feasible under UNIX.

As an example, figure 3.5 shows a concatenated listing
of 4 such files from a single session with the macpitts
interpreter. As would be expected, the format of these
files is that of a LISP list, whose meaning can be clearly
inferred because it follows the same syntax as the MacPitts
language itself. The first file, lines 1-14, is a dump of
the state of the circuit after setting the input ports ain
and bin to 4 and 3, respectively, and the reset signal to
false. Note that all data downstream of the inputs is still
undefined at this point. Lines 16-28 show the result after
one clock cycle. Lines 30-42 show the result after two
clock cycles. Lines 44-56 show the result after the third
clock cycle when the result, 12, is present for the first
time on the output port. Note that at this point the input
data, which was never changed during this session, has also
propagated down the three stage pipeline. Of course, one
would normally not use a pipelined processor with static
data, because the advantage of higher throughput is wasted.
The exercise only serves to demonstrate the behavior of the
interpreter option.

Figure 3.5  MacPitts Interpreter Session for multip.

Two points of practical interest should be made before
closing the interpreter discussion. First, it should be
observed that the bottom lines of text in the terminal
display will be jumbled on the ADM-36 terminals because the
/etc/termcap libraries in UNIX version 4.2 differ slightly
from those in UNIX version 4.1. Proper screen presentation
is obtained, however, if the GIGI terminal is used. Second,
the interpreter runs very slowly. It is not unusual during
hours of heavy system usage for one to two minutes of
terminal time to elapse while the interpreter is processing
a single command to cycle the clock. At night, with only 2
users logged on, this clocking operation only takes ten to
fifteen seconds.


E. EVOLUTION OF THE 8 BIT PIPELINED MULTIPLIER

1. Design Motivation and Constraints

One possible application for a digital pipelined
multiplier of unsigned integers is as part of a high speed
digital filter realization. Work done by Loomis and Sinha
[Ref. 17] indicates that the impact of pipelining delays on
the behavior of digital recursive filters can be compensated
for by adjusting the filter weights. Furthermore, their
work shows that the stability of the filter can be improved
by increasing the number of pipeline stages. It was decided
that the design of a multiplier for such applications could
be a suitable vehicle from which to study the MacPitts
compiler.


The design of circuits which can be fabricated using
the available ARPA/MOSIS implementation service is
constrained by two standard parameters: a maximum project
size of 6890 x 6300 microns, and a maximum bonding pad count
of 64 pins. To fully explore the capabilities of MacPitts,
it is probably most enlightening to proceed in steps toward
the ultimate design.

2. First Design: 3 Stages, 8 Bits on One Chip

To better appreciate the issues involved, the first
design is an expansion of multip.mac to an 8 bit wide data
path with enough "cond" forms to realize an 8 bit
multiplication. Note, however, that the MSB of the
multiplicand (ain) must be zero to avoid overflows of the
partial product and results ports. Two output ports are
used, one for the high order 8 bits of the result (hres),
and one for the low order 8 bits of the result (lres).
Together these ports form a 16 bit product. One expects the
hres MSB always to be zero because the largest valid product
is 127x255=32385, which is less than 2^15. Because the
design has three sets of registers, there are three stages
of pipelining, and there is room in the chip for three
distinct multiplication problems to be in process
simultaneously. A speed vs. area tradeoff is effected by
alternating ports with registers in the data path. Ports
consume less area than registers. However, ports also
introduce more delay in the pipeline stages (whose
boundaries are defined by registers) thereby lowering the
maximum clock frequency. To further save area, the
multiplier bits from bin share space in the low order
intermediate results registers (lr0, lr1, lr2) and ports
(lp0, lp1, lp2, lp3, lres) by using the following device:
after each bit of the multiplier is tested, it is shifted
off the right end of the register/port, leaving room at the
left end for another bit of the low order result to be
shifted in. The source file for this design, multip8.mac,
is shown in figures 3.6 and 3.7. This file was arrived at
after first considering what resources would be needed to
perform the multiplication. Then register/port templates
were written down on paper, and the flow of data traced for
a specific case. Next the algorithm depicted by the data
flow was translated into MacPitts language resulting in a
diagram resembling the style of figure 3.2. Finally the
definitions, conditions, and reset functions were added to
complete the multip8.mac file. Figure 3.8 partially
illustrates the manner in which this was done for the
example 104x22=2288. Only the first pipeline stage is shown,
representing the first two multiplier bits.
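The register sharing device described above can be sketched
as a short C routine. This is only an illustrative software
model, not MacPitts output: the function name multip8_model
is hypothetical, and truncating the sum to 8 bits is an
assumption made to mimic the 8 bit data path. The variables
hi and lo stand for the high order and low order
intermediate results.

```c
#include <stdint.h>

/* Software model of the shift-and-add scheme of multip8.mac.
 * (Illustrative sketch; the function name is hypothetical.)
 * lo starts out holding the multiplier. At each of the 8 steps
 * the multiplier bit in lo bit 0 is tested and shifted off the
 * right end, while bit 0 of the running sum is shifted in at
 * the left end of lo. */
uint16_t multip8_model(uint8_t ain, uint8_t bin)
{
    uint8_t hi = 0;    /* high order partial product */
    uint8_t lo = bin;  /* multiplier, gradually replaced by low order result */
    int step;

    for (step = 0; step < 8; step++) {
        /* Add the multiplicand only when the current multiplier
         * bit is 1. The sum is truncated to 8 bits (assumption),
         * which is why the MSB of ain must be zero. */
        uint8_t sum = (lo & 1) ? (uint8_t)(hi + ain) : hi;
        lo = (uint8_t)(((sum & 1) << 7) | (lo >> 1));
        hi = (uint8_t)(sum >> 1);
    }
    /* Concatenate the two 8 bit results into one 16 bit product. */
    return (uint16_t)((hi << 8) | lo);
}
```

For the example traced in figure 3.8, multip8_model(104, 22)
returns 2288.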

Figure 3.9 shows the linear arrangement of the ports
and registers in the data path for this multiplier, as well
as the placement of shift and add organelles. The flow of
data is down the page. The large size of the full adders
relative to the other organelles is not reflected in the
scale of this figure. The resulting macpitts layout for
this design measures 11848 x 4897.5 microns, which is far
too large to be fabricated in a standard MOSIS run. It
appears that the design must therefore be partitioned in
some way among two or more chips. Ideally, these
partitioned "partial multipliers" should be identical in
design if fabrication and testing costs are to be minimized.

3. First Partitioning: 2 Bits, 1 Stage Pipeline


The multip8 design may be partitioned in a number of
ways. The first approach might be to process two multiplier
bits on the chip using one register stage and then one port
stage to hold the two partial products in a single pipeline
stage, then pipe the partial result to another identical
chip. Such a design requires 4 chips to do a complete 7 bit
by 8 bit multiplication with 4 stages of pipelining in all.

    (def reset signal input 37)
    (def 38 power)
    ; end of definitions
    (always
      (cond ((bit 0 bin)
             (setq hp0 (>> ain))
             (setq lp0 (>> (bit 0 ain) bin)))
            (t
             (setq hp0 0)
             (setq lp0 (>> bin))))
      (cond ((bit 0 lp0)
             (setq hr0 (>> (+ hp0 ain)))
             (setq lr0 (>> (bit 0 (+ hp0 ain)) lp0)))
            (t
             (setq hr0 (>> hp0))
             (setq lr0 (>> (bit 0 hp0) lp0))))
      (cond ((bit 0 lr0)
             (setq hp1 (>> (+ hr0 a0)))
             (setq lp1 (>> (bit 0 (+ hr0 a0)) lr0)))
            (t
             (setq hp1 (>> hr0))
             (setq lp1 (>> (bit 0 hr0) lr0))))
      (cond ((bit 0 lp1)
             (setq hr1 (>> (+ hp1 a0)))
             (setq lr1 (>> (bit 0 (+ hp1 a0)) lp1)))
            (t
             (setq hr1 (>> hp1))
             (setq lr1 (>> (bit 0 hp1) lp1))))
      (cond ((bit 0 lr1)
             (setq hp2 (>> (+ hr1 a1)))
             (setq lp2 (>> (bit 0 (+ hr1 a1)) lr1)))
            (t
             (setq hp2 (>> hr1))
             (setq lp2 (>> (bit 0 hr1) lr1))))

Figure 3.6 Multip8.mac Source File.

      (cond ((bit 0 lp2)
             (setq hr2 (>> (+ hp2 a1)))
             (setq lr2 (>> (bit 0 (+ hp2 a1)) lp2)))
            (t
             (setq hr2 (>> hp2))
             (setq lr2 (>> (bit 0 hp2) lp2))))
      (cond ((bit 0 lr2)
             (setq hp3 (>> (+ hr2 a2)))
             (setq lp3 (>> (bit 0 (+ hr2 a2)) lr2)))
            (t
             (setq hp3 (>> hr2))
             (setq lp3 (>> (bit 0 hr2) lr2))))
      (cond ((bit 0 lp3)
             (setq hres (>> (+ hp3 a2)))
             (setq lres (>> (bit 0 (+ hp3 a2)) lp3)))
            (t
             (setq hres (>> hp3))
             (setq lres (>> (bit 0 hp3) lp3))))
      (cond (reset
             (setq a0 0)
             (setq a1 0)
             (setq a2 0))
            (t
             (setq a0 ain)
             (setq a1 a0)
             (setq a2 a1)))))

Figure 3.7 Multip8.mac Source File (Continued).


Figure 3.10 is a block diagram of this design approach. The
MacPitts source file for this design, given in figure 3.11,
defines another input port, "hin," which should be connected
to the high order 8 bit partial product output of the
previous stage, unless the chip is the first one in the
array. In that case, "hin" is connected to ground (i.e.
zero). To further reduce area, the reset function was
eliminated, because it is not in any way essential to the
functioning of a multiplier used in a high throughput signal
processing environment such as is envisioned for this
design.
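The partitioning idea can be illustrated with a small C
model in which identical partial multipliers are chained,
hout and lout of one chip feeding hin and bin of the next.
This is a sketch under stated assumptions: the names
cell_out, partial_multiplier and cascade are hypothetical,
pipeline timing is ignored (only the arithmetic is modeled),
and sums are truncated to 8 bits to mimic the data path
width.

```c
#include <stdint.h>

/* Outputs of one partial multiplier chip (hypothetical type). */
typedef struct {
    uint8_t hout;  /* upper 8 bits of partial product */
    uint8_t lout;  /* lower 8 bits plus remaining multiplier bits */
} cell_out;

/* Arithmetic performed by one chip that tests nbits multiplier bits. */
cell_out partial_multiplier(uint8_t ain, uint8_t bin, uint8_t hin, int nbits)
{
    cell_out r;
    int i;

    r.hout = hin;
    r.lout = bin;
    for (i = 0; i < nbits; i++) {
        /* Conditional add, truncated to 8 bits (assumption). */
        uint8_t sum = (r.lout & 1) ? (uint8_t)(r.hout + ain) : r.hout;
        r.lout = (uint8_t)(((sum & 1) << 7) | (r.lout >> 1));
        r.hout = (uint8_t)(sum >> 1);
    }
    return r;
}

/* Chain nchips identical chips in a linear array. */
uint16_t cascade(uint8_t ain, uint8_t bin, int nchips, int bits_per_chip)
{
    cell_out r = { 0, bin };  /* hin of the first chip is grounded */
    int k;

    for (k = 0; k < nchips; k++)
        r = partial_multiplier(ain, r.lout, r.hout, bits_per_chip);
    return (uint16_t)((r.hout << 8) | r.lout);
}
```

With these definitions, cascade(104, 22, 4, 2) models four
chips testing two bits each and yields 2288, the same
product obtained from two chips testing four bits each,
cascade(104, 22, 2, 4).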

This arrangement of identical processing elements
connected in a linear array to produce a pipelined result is
similar in concept to the systolic array approach formulated
by Kung [Ref. 18], although he was more generally concerned
with individual processing elements of greater complexity
than that of multip8a cells.

Figure 3.8 Use of Ports and Registers in multip8.mac.

The macpitts layout of multip8a has outline
dimensions of 5848 x 6140 microns. The data path and control
unit only occupy approximately 3000 x 2500 microns. The
overall chip is large compared to its "working circuitry"
because of the need to place 53 pin pads around only three
sides of the perimeter. This design does not approach full
utilization of the available 6890 x 6300 micron silicon
area.

4. Second Partitioning: 4 Bits, 2 Stage Pipeline


It seems clear that more of the design will fit on
the chip and still not exceed the maximum size for
fabrication. Design multip8b (source file shown in figure
3.12) tests four bits of the multiplier on one chip;
therefore, only two of these chips are needed to do a
complete multiplication. Essentially this is just a doubled
version of multip8a. The MacPitts layout is 7130 x 6140
microns for the 5 micron option. This is too large to
fabricate. Rerun with the 4 micron option, the multip8b chip
has satisfactory dimensions: 5884 x 6024 microns.

Figure 3.9 Data Path Architecture of Multip8 Chip.

Figure 3.10 Block Diagram of First Partitioning.

    ; 1 stage of a 4-stage pipelined multiplier
    ; product is a 16 bit unsigned integer
    (program multip8a 8 ; data path is 8 bits wide
    (def 1 ground)
    (def ain port input (2 3 4 5 6 7 8 9)) ; multiplicand input
    (def bin port input (10 11 12 13 14 15 16 17)) ; multiplier input
    ; this port also receives the lower 8 bits of the partial product
    (def hin port input (18 19 20 21 22 23 24 25)) ; upper 8 bits of
    ; partial product from preceding stage, zero if first stage
    (def aout port output (26 27 28 29 30 31 32 33)) ; multiplicand output
    (def hout port output (34 35 36 37 38 39 40 41)) ; upper 8 bits of
    ; partial product output
    (def lout port output (42 43 44 45 46 47 48 49)) ; lower 8 bits of
    ; partial product output and shifted multiplier output
    (def a1 register)
    (def hr1 register)
    (def lr1 register)
    (def 50 phia)
    (def 51 phib)
    (def 52 phic)
    (def 53 power)
    ; end of definitions
    (always
      (cond ((bit 0 bin)
             (setq hr1 (>> (+ hin ain)))
             (setq lr1 (>> (bit 0 (+ hin ain)) bin)))
            (t
             (setq hr1 (>> hin))
             (setq lr1 (>> (bit 0 hin) bin))))
      (cond ((bit 0 lr1)
             (setq hout (>> (+ hr1 ain)))
             (setq lout (>> (bit 0 (+ hr1 ain)) lr1)))
            (t
             (setq hout (>> hr1))
             (setq lout (>> (bit 0 hr1) lr1))))
    ;
      (setq a1 ain)
      (setq aout a1)))

Figure 3.11 Multip8a.mac Source File.


5. Third Partitioning: 4 Bits, 4 Stage Pipeline

By replacing every internal port with a register,
and providing two additional corresponding pipeline
registers for the multiplicand, the delay per pipeline stage
can be reduced by a factor of approximately two because the
adders drive a register directly instead of through a port
and another adder. The clock rate can therefore be
approximately doubled. This modification has another
attractive feature in that it allows the output port to be
driven directly by a register rather than by an adder. Thus,
the output data is valid sooner after the completion of a
clock cycle than it was in the case of multip8b.

    ; 2 stages of a 4-stage pipelined multiplier
    ; product is a 16 bit unsigned integer
    (program multip8b 8 ; data path is 8 bits wide
    (def 1 ground)
    (def ain port input (2 3 4 5 6 7 8 9)) ; multiplicand input
    (def bin port input (10 11 12 13 14 15 16 17)) ; multiplier input
    ; this port also receives the lower 8 bits of the partial product
    (def hin port input (18 19 20 21 22 23 24 25)) ; upper 8 bits of
    ; partial product from preceding stage, zero if first stage
    (def aout port output (26 27 28 29 30 31 32 33)) ; multiplicand output
    (def hout port output (34 35 36 37 38 39 40 41)) ; upper 8 bits of
    ; partial product output
    (def lout port output (42 43 44 45 46 47 48 49)) ; lower 8 bits of
    ; partial product output and shifted multiplier output
    (def a1 register)
    (def a2 register)
    (def hr1 register)
    (def lr1 register)
    (def hp1 port internal)
    (def lp1 port internal)
    (def hr2 register)
    (def lr2 register)
    (def 50 phia)
    (def 51 phib)
    (def 52 phic)
    (def 53 power)
    ; end of definitions
    (always
      (cond ((bit 0 bin)
             (setq hr1 (>> (+ hin ain)))
             (setq lr1 (>> (bit 0 (+ hin ain)) bin)))
            (t
             (setq hr1 (>> hin))
             (setq lr1 (>> (bit 0 hin) bin))))
      (cond ((bit 0 lr1)
             (setq hp1 (>> (+ hr1 ain)))
             (setq lp1 (>> (bit 0 (+ hr1 ain)) lr1)))
            (t
             (setq hp1 (>> hr1))
             (setq lp1 (>> (bit 0 hr1) lr1))))
      (cond ((bit 0 lp1)
             (setq hr2 (>> (+ hp1 a1)))
             (setq lr2 (>> (bit 0 (+ hp1 a1)) lp1)))
            (t
             (setq hr2 (>> hp1))
             (setq lr2 (>> (bit 0 hp1) lp1))))
      (cond ((bit 0 lr2)
             (setq hout (>> (+ hr2 a1)))
             (setq lout (>> (bit 0 (+ hr2 a1)) lr2)))
            (t
             (setq hout (>> hr2))
             (setq lout (>> (bit 0 hr2) lr2))))
    ;
      (setq a1 ain)
      (setq a2 a1)
      (setq aout a2)))

Figure 3.12 Multip8b.mac Source File.

    ; 2 stages of a 4-stage pipelined multiplier
    ; product is a 16 bit unsigned integer
    (program multip8c 8 ; data path is 8 bits wide
    (def 1 ground)
    (def ain port input (2 3 4 5 6 7 8 9)) ; multiplicand input
    (def bin port input (10 11 12 13 14 15 16 17)) ; multiplier input
    ; this port also receives the lower 8 bits of the partial product
    (def hin port input (18 19 20 21 22 23 24 25)) ; upper 8 bits of
    ; partial product from preceding stage, zero if first stage
    (def aout port output (26 27 28 29 30 31 32 33)) ; multiplicand output
    (def hout port output (34 35 36 37 38 39 40 41)) ; upper 8 bits of
    ; partial product output
    (def lout port output (42 43 44 45 46 47 48 49)) ; lower 8 bits of
    ; partial product output and shifted multiplier output
    (def a1 register)
    (def a2 register)
    (def a3 register)
    (def a4 register)
    (def hr1 register)
    (def lr1 register)
    (def hr2 register)
    (def lr2 register)
    (def hr3 register)
    (def lr3 register)
    (def hr4 register)
    (def lr4 register)
    (def 50 phia)
    (def 51 phib)
    (def 52 phic)
    (def 53 power)
    ; end of definitions
    (always
      (cond ((bit 0 bin)
             (setq hr1 (>> (+ hin ain)))
             (setq lr1 (>> (bit 0 (+ hin ain)) bin)))
            (t
             (setq hr1 (>> hin))
             (setq lr1 (>> (bit 0 hin) bin))))
      (cond ((bit 0 lr1)
             (setq hr2 (>> (+ hr1 a1)))
             (setq lr2 (>> (bit 0 (+ hr1 a1)) lr1)))
            (t
             (setq hr2 (>> hr1))
             (setq lr2 (>> (bit 0 hr1) lr1))))
      (cond ((bit 0 lr2)
             (setq hr3 (>> (+ hr2 a2)))
             (setq lr3 (>> (bit 0 (+ hr2 a2)) lr2)))
            (t
             (setq hr3 (>> hr2))
             (setq lr3 (>> (bit 0 hr2) lr2))))
      (cond ((bit 0 lr3)
             (setq hr4 (>> (+ hr3 a3)))
             (setq lr4 (>> (bit 0 (+ hr3 a3)) lr3)))
            (t
             (setq hr4 (>> hr3))
             (setq lr4 (>> (bit 0 hr3) lr3))))
    ;
      (setq hout hr4)
      (setq lout lr4)
      (setq a1 ain)
      (setq a2 a1)
      (setq a3 a2)
      (setq a4 a3)
      (setq aout a4)))

Figure 3.13 Multip8c.mac Source File.
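The clocked behavior of the all-register version can be
sketched in C to show why each stage needs its own copy of
the multiplicand. This is an illustrative model only: the
type multip8c_state and the function names are hypothetical,
all registers are assumed to load simultaneously on each
clock, and sums are truncated to 8 bits to mimic the data
path width.

```c
#include <stdint.h>

/* Register contents of one chip (hypothetical model type). */
typedef struct {
    uint8_t a[4];   /* a1..a4: multiplicand pipeline registers */
    uint8_t hr[4];  /* hr1..hr4: high order partial results */
    uint8_t lr[4];  /* lr1..lr4: low order results and multiplier bits */
} multip8c_state;

/* One conditional add and shift, as performed between two stages. */
static void step(uint8_t h, uint8_t l, uint8_t a,
                 uint8_t *h_next, uint8_t *l_next)
{
    uint8_t sum = (l & 1) ? (uint8_t)(h + a) : h;  /* 8 bit sum (assumption) */
    *h_next = (uint8_t)(sum >> 1);
    *l_next = (uint8_t)(((sum & 1) << 7) | (l >> 1));
}

/* Advance the chip by one clock cycle. Every register is loaded
 * from the values held before the clock edge, so the next state
 * is computed from a copy of the old state. */
void multip8c_clock(multip8c_state *s, uint8_t ain, uint8_t bin, uint8_t hin)
{
    multip8c_state old = *s;
    int i;

    step(hin, bin, ain, &s->hr[0], &s->lr[0]);
    s->a[0] = ain;
    for (i = 1; i < 4; i++) {
        /* Stage i uses the multiplicand that travelled with its data. */
        step(old.hr[i - 1], old.lr[i - 1], old.a[i - 1],
             &s->hr[i], &s->lr[i]);
        s->a[i] = old.a[i - 1];
    }
}

/* The output ports are driven directly by the last registers. */
uint8_t multip8c_hout(const multip8c_state *s) { return s->hr[3]; }
uint8_t multip8c_lout(const multip8c_state *s) { return s->lr[3]; }
```

In this model a problem presented before one clock edge
appears at the outputs four clocks later, and a different
problem presented on the following cycle appears one clock
after that, each computed with its own multiplicand.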

Some room to spare on the multip8b 4 micron layout
leaves hope that this four stage pipeline algorithm, figure
3.13, may be feasible. In fact, the macpitts layout for
multip8c measures 8318 x 6140 microns in 5 micron
technology. In 4 micron technology the chip measures 6766 x
6024 microns, which consumes almost 94 per cent of the
maximum allowable chip area. This is a good indication that
the limit may in fact have been reached on obtaining any
more elaborate design variations for the multiplier which
can be fabricated by the standard MOSIS facilities.
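The size comparisons made in this section reduce to simple
arithmetic, sketched here in C. The function and macro names
are hypothetical; the limits are the 6890 x 6300 micron
project size quoted earlier, and the check assumes the
layout is not rotated.

```c
/* Maximum MOSIS project size quoted in the text, in microns. */
#define MOSIS_MAX_X 6890.0
#define MOSIS_MAX_Y 6300.0

/* Nonzero if a layout of w x h microns fits the project size
 * in the orientation given (no rotation is considered). */
int fits_mosis(double w, double h)
{
    return w <= MOSIS_MAX_X && h <= MOSIS_MAX_Y;
}

/* Layout area as a percentage of the maximum project area. */
double area_percent(double w, double h)
{
    return 100.0 * (w * h) / (MOSIS_MAX_X * MOSIS_MAX_Y);
}
```

For the 4 micron multip8c layout, area_percent(6766, 6024)
evaluates to about 93.9, in agreement with the figure of
almost 94 per cent, while the 5 micron layouts of multip8b
and multip8c fail the fits_mosis test.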

A summary of statistics produced by macpitts for all
the multiplier designs explored in this chapter is given in
table I. Each line represents a different cif file, some of
which may be derived from the same source file, with the
only difference being the invocation options. The root of
each entry in the "DESIGN" column corresponds to the name of
a multiplication algorithm introduced previously in this
chapter. To clarify the notation of the "DESIGN" column,
note that the last digit gives the minimum feature size
selected, in microns. Where no digit is explicitly stated,
the minimum feature size is 5 microns.

E. DESIGN VALIDATION

1. Functional Simulation

Before proceeding with fabrication it is necessary
to validate the multip8c4 design by functional simulation,
design rule checking and node extraction with subsequent
event simulation.

In any functional simulation the first issue to
address is, "How exhaustive shall the simulation be?" Truly
exhaustive testing of multip8c4 is a formidable task, at
best. The number of different electrically possible
combinations of bits for the three input ports (ain, bin and
hin) is

     (2^8)^3 = 2^24 = 16,777,216.

Then, there are four internal pipeline stages. Therefore,
ideally, every sequence of 4 of these 16,777,216 inputs
should be tested because there should be no restrictions on
the ordering of problems in the pipeline. This
consideration increases the number of possible states to

     (16,777,216)^4 = 7.92 x 10^28 states.

Each state transition requires five transitions of the raw
clock, as will be recalled from figure 2.4. It is
reasonable to assume a raw clock frequency of 10 MHz for an
nMOS circuit. For the master-slave flip flops used in
MacPitts this translates to a state transition rate of 2
MHz. From this assumption the time to cover all states of
this circuit is calculated to be

     7.92 x 10^28 states / 2 x 10^6 states/sec = 3.96 x 10^22 seconds
     3.96 x 10^22 sec / 8.64 x 10^4 sec/day = 4.58 x 10^17 days
     4.58 x 10^17 days / 365 days/year = 1.26 x 10^15 years

Therefore testing every electrically possible state, even
once, is obviously impractical.
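The estimate above can be reproduced with a few lines of C.
The function names and parameters are hypothetical and serve
only to restate the arithmetic: 24 input bits, 4 pipeline
stages, a 10 MHz raw clock and five raw clock transitions
per state transition.

```c
/* Years needed to visit every sequence of `stages` inputs of
 * `input_bits` bits each, one state per state transition. */
double exhaustive_years(int input_bits, int stages,
                        double raw_clock_hz, int raw_per_state)
{
    double states = 1.0;
    double rate, seconds;
    int i;

    /* 2^(input_bits * stages), computed by repeated doubling. */
    for (i = 0; i < input_bits * stages; i++)
        states *= 2.0;

    rate = raw_clock_hz / raw_per_state;   /* state transitions per second */
    seconds = states / rate;
    return seconds / 86400.0 / 365.0;      /* seconds -> days -> years */
}

/* Seconds needed to apply each input combination exactly once. */
double single_pass_seconds(int input_bits, double raw_clock_hz,
                           int raw_per_state)
{
    double combos = 1.0;
    int i;

    for (i = 0; i < input_bits; i++)
        combos *= 2.0;
    return combos / (raw_clock_hz / raw_per_state);
}
```

Calling exhaustive_years(24, 4, 10e6, 5) gives a value
close to 1.26 x 10^15 years, while
single_pass_seconds(24, 10e6, 5) gives between 8 and 9
seconds.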

If only each 24 bit input combination were tested
once, without regard for the order in which these tests were
conducted, the time required is only

     16,777,216 / 2 x 10^6 = 8.39 seconds.

It should be remembered that, in its intended application,
the number of expected input combinations to multip8c is
considerably smaller. There are only (255x127)+1 or 32386
possible 7x8 bit multiplication problems. Each of these
will have hin=0 on the first chip. The second chip will
have only one unique set of inputs passed to it by the first
chip for each of these 32386 problems. Therefore, the total
number of different input combinations of ain, bin and hin
that will be encountered in actual operation is no greater
than 2x32386 or 64772. The precise number is somewhat
smaller still because some problems, such as those which
have zero for the multiplier or multiplicand, will output a
zero from hout in the first chip to hin of the second chip
thus duplicating the first chip set of inputs for some other
problem.

When using the macpitts interpreter to run a
functional simulation, at least fifteen seconds must be
allowed for computing the changes at each clock cycle. This
fact makes testing even all expected input combinations
impractical. Instead one random problem is chosen: 104 x 22
= 2288. The product 2288 is represented as hout=00001000=8,
decimal and lout=11110000=240, decimal, since
(256x8)+240=2288. Figures D.2 through D.6 in Appendix D
show interpreter output files for each of the 8 clock cycles
needed to produce the result, and a ninth clock cycle to
demonstrate that the output is not subject to uncommanded
changes. Between clock cycles 4 and 5 the inputs were
changed to simulate two chips in cascade. The results are
correct, indicating proper behavior of the specification
algorithm.

A source listing for the program "values" appears in
figure 3.14, together with a sample run using the problem
given above. This program allows generation of the multip8c
result given any combination of ain, bin and hin values
entered from the terminal keyboard.

    #include <stdio.h>

    int main(void)  /* interactive simulation of multip8c chip */
    {
        unsigned int ain, bin, hin, hout, lout, result;
        unsigned int test1, test2, c;

        printf("Type ^C anytime to quit.\n\n");

        /* Loop until interrupt is signaled from keyboard */
    top:
        /* Read input values from keyboard. */
        printf("Enter ain... ");
        scanf("%u", &ain);
        printf("Enter bin... ");
        scanf("%u", &bin);
        printf("Enter hin... ");
        scanf("%u", &hin);

        /* Compute the results: first initialize output registers. */
        lout = bin;
        hout = hin;

        /* Simulate multip8c algorithm. */
        for (c = 1; c <= 4; c++) {
            test1 = lout & 001;      /* current multiplier bit */
            if (test1 == 1)
                hout = hout + ain;
            lout = lout >> 1;
            test2 = hout & 001;      /* bit shifted from hout into lout */
            if (test2 == 1)
                lout = lout + 128;
            hout = hout >> 1;
        }

        /* Put output register values into concatenated decimal form. */
        result = 256*hout + lout;

        /* Display all values on the screen. */
        printf("ain=%-4u bin=%-4u hin=%-4u hout=%-4u lout=%-4u result=%-5u\n\n",
               ain, bin, hin, hout, lout, result);

        goto top;
    }

    SAMPLE RUN

    % values
    Type ^C anytime to quit.

    Enter ain... 104
    Enter bin... 22
    Enter hin... 0
    ain=104  bin=22   hin=0    hout=39   lout=1    result=9985

    Enter ain... 104
    Enter bin... 1
    Enter hin... 39
    ain=104  bin=1    hin=39   hout=8    lout=240  result=2288

    Enter ain... ^C
    %

Figure 3.14 Values: Program to Compute Multip8c Output.

2. Design Rule Checking

The reality of the claim that MacPitts designs are
"correct by construction" can be tested. The multip8c.cif
file was checked for design rule errors by running it
through the Stanford "drc" program via "cll" [Ref. 1: pp.
147-151] to reformat the file. The command sequence is:

    % cif multip8c.cif -gng
    % cll multip8c.co
    % drc multip8c.sco

There are two problems, however, with using drc on this
design. One is that the design rules used by MacPitts are
not the standard Mead Conway rules [Ref. 2: pp. 47-51], but
are a combination of these and the MOSIS design rules which
include buried contacts [Ref. 2: page 133]. Buried
contacts are not recognized by "drc." The other problem is
that the "cif" program does not correctly read .cif files
which use the 200 centimicron lambda dimension; round-off
error is introduced. Therefore, the design rule check can
only be performed on multip8c5, not on multip8c4 which is
the version to be fabricated.

The results of this drc run, thus caveated, produced
2 types of stated errors, both of which are spurious. One
is a "poly to diffusion contact separation" error in the
controller where macpitts abuts two contacts, one to poly
and one to diffusion, but both through the same overlying
metal conductor. The intent of the design rule checker, in
this instance, is to forewarn of the possibility of a short
circuit; a short circuit is in fact the desired result of
this unorthodox structure. The other stated error is an
"implant surround" error in the register clock. This
structure is flagged because the buried contact to that
layer was ignored by drc. Based on this non-ideal but only
available check of design rules, it was concluded that the
multip8c5.cif file does define processable mask layers. It
is assumed that the multip8c4.cif file is also processable
because it differs only in scale from multip8c5.cif, except
for the pads, whose design is supposedly from a standard
library supplied by MOSIS.
3. Node Extraction and Event Simulation


The node extraction program "extract" which is part
of the Stanford VLSI design tools does not accurately
interpret .cif files with lambda equal to 200 centimicrons.
Fortunately, the "mextra" program, written at Berkeley (see
Appendix C), can accommodate both 200 and 250 centimicron
cif files.

To obtain an extraction and simulation of the
multip8c design in 4 micron size, the corresponding cif
file, multip8c4.cif, was converted to the ".ca" format used
by the Berkeley "caesar" layout editor. Then labels for all
the pads were added to the design using caesar so that
mextra would know which nodes are to be accessible for
monitoring. Before exiting caesar, a new cif file,
mul8c.cif, is written using the caesar command

    : cif -p mul8c.

The node extraction is made by issuing the command

    % mextra mul8c.

The result of the mextra run is a .sim file suitable for
input to the "esim" event simulator [Ref. 1: pp. 152-155],
and also a .log file (figure 3.15) in which is contained
summary statistics of the extraction.

    Window: 0 676600 0 602400
    801 depletion
    1612 enhancement
    1398 nodes

Figure 3.15 Mextra .log File for Mul8c.cif.


The simulation, using extracts of mul8c.cif, was set
up to perform the same tests used in the macpitts
interpreter session of multip8c. To do this, two macro files
were created. One defines the three phase clock sequence,
declares which nodes to watch, and sets the values of the
inputs to those which simulate the problem 104x22. The
second macro file, which was designed to be read in at the
midpoint of the simulation, redefines the input values to
make the chip perform like the second multip8c unit in the
pipeline. These files are both listed in figure 3.16.


    % cat mul8c.macro
    K phia 11011 phib 10000 phic 10001
    W ain ain7 ain6 ain5 ain4 ain3 ain2 ain1 ain0
    W bin bin7 bin6 bin5 bin4 bin3 bin2 bin1 bin0
    W hin hin7 hin6 hin5 hin4 hin3 hin2 hin1 hin0
    W hout hout7 hout6 hout5 hout4 hout3 hout2 hout1 hout0
    W lout lout7 lout6 lout5 lout4 lout3 lout2 lout1 lout0
    W aout aout7 aout6 aout5 aout4 aout3 aout2 aout1 aout0
    W clock phia phib phic
    h ain6 ain5 ain3 bin4 bin2 bin1
    l ain7 ain4 ain2 ain1 ain0 bin7 bin6 bin5 bin3 bin0
    l hin7 hin6 hin5 hin4 hin3 hin2 hin1 hin0

    % cat mul8c.macro2
    h hin5 hin2 hin1 hin0 bin0
    l bin7 bin6 bin5 bin4 bin3 bin2 bin1

Figure 3.16 Two Macro Driver Files for Event Simulation.
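The "h" and "l" lines of such a macro file simply list the
bits of each input value that are to be driven high or low.
A minimal C sketch of how these lists could be derived from
a decimal value follows. The function name node_lists is
hypothetical, and the node naming (port name followed by the
bit number) is taken from the labels shown in figure 3.16.

```c
#include <stdio.h>
#include <string.h>

/* Build space separated lists of node names for the bits of
 * `value` that are 1 (high) and 0 (low), MSB first.
 * Both buffers must hold at least one character. */
void node_lists(const char *port, unsigned value, int width,
                char *high, size_t high_size,
                char *low, size_t low_size)
{
    int bit;

    high[0] = '\0';
    low[0] = '\0';
    for (bit = width - 1; bit >= 0; bit--) {
        int is_high = (value >> bit) & 1u;
        char *dst = is_high ? high : low;
        size_t cap = is_high ? high_size : low_size;
        size_t used = strlen(dst);

        /* Append "<port><bit>", preceded by a space if not first. */
        snprintf(dst + used, cap - used, "%s%s%d",
                 used ? " " : "", port, bit);
    }
}
```

For ain = 104 this yields "ain6 ain5 ain3" for the high list
and "ain7 ain4 ain2 ain1 ain0" for the low list, matching
the entries for ain in the first macro file.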

The record of a simulation run using these files is
contained in Appendix D. It shows the same correct results
obtained with the macpitts functional interpreter. Note
however that when the "I" command is given to esim, all the
circuit nodes are initialized to some value over which the
user has no control. Therefore, the values of the output
ports are not meaningful until the fourth clock cycle, even
though they are defined during initialization.
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The  event  simulation  result  is  encouraging  evidence 
that  macpitts  can  produce,  in  at  least  one  instance,  a  mask- 
level  description  that  correctly  reflects  a  circuit  design 
with  algorithmic  behavior  specified  by  the  designer. 
Further  validation  evidence  was  obtained  by  performing  an 
extraction  and  event  simulation  on  multip8c5. cif ,  the  5 
micron  version  of  the  multiplier.  This  extraction  could  be 
done  using  the  Stanford  program;  the  result  vas  the  same  as 
for  meitxa.  The  event  simulation  produced  a  correct  result 
for  the  same  exercise.  It  vas  concluded,  therefore,  that 
the  design  vas  ready  for  fabrication. 

F.  SUMMARY OF ACTIVITIES IN THE MACPITTS DESIGN CYCLE

A recommended pattern of steps to follow in the MacPitts design
cycle can be summarized by presenting the sequence of UNIX commands
issued by the designer for a typical case. This sequence divides
into two paths after the cif file is created, depending on whether
4 micron or 5 micron minimum feature size is selected. For the 4
micron option the caesar/mextra tools must be used. For the 5
micron option it is more convenient to use extract, a program which
recognizes node labels furnished by MacPitts with the cif user
extension 0.

As a starting point, it is assumed that the designer already has
formulated a precise idea of what behavior the chip is to exhibit,
and has translated the behavioral specification into MacPitts
language.

The  5  micron  path,  using  the  multip8c.mac  source  file  as 
an  example,  is  as  follows: 

% vi multip8c.mac
(Create the source file.)

% macpitts multip8c int herald
(Run the interpreter to debug the source file and verify the
functional correctness of the specification. Save states as
desired using the "p" interpreter command, renaming files from a
second terminal keyboard to prevent overwriting. Quit the
interpreter.)

% script
(Start a recording session for the terminal screen.)

% macpitts multip8c 5u herald
(Generate 5 micron multip8c.cif and complete design statistics.)

% mv multip8c.cif multip8c5.cif
(Rename cif file to proclaim that it is a 5 micron design.)

% ctrl-D
(Stop the recording session.)

% print typescript
(Get hardcopy of compiler statistics and heralds.)

% cif multip8c5.cif -gng
% ell multip8c5.co
% drc multip8c5.sco
(Obtain design rules check.)

% extract multip8c5
(Obtain a node extract.)

% vi multip8c5.sym
(Change spelling of VDD and ground node labels to Vdd and GND,
respectively.)

% sim multip8c5
(Obtain the multip8c5.sim file.)

% vi multip8c.macro1
(Create one or more testing sequence files for the event
simulator. See the "esim" section of Appendix C for details.)

% script

% esim multip8c5.sim multip8c.macro1
(Perform event simulation of chip.)

% ctrl-D

% print typescript

% vi multip8c5.cif
("Comment out" the user extension 0 lines at the beginning of this
file by enclosing them all in one set of parentheses followed by a
semicolon. See the "cifplot" section of Appendix C for details.)

% stipple multip8c5.cif
(Obtain stipple plot on the Versatec plotter.)

The 4 micron path, using the same example, contains exactly the
same steps through the interpreter run, then continues as follows:

% script

% macpitts multip8c 4u herald
(Generate 4 micron multip8c.cif and complete statistics.)

% mv multip8c.cif multip8c4.cif
(Rename cif file to proclaim that it is a 4 micron design.)

% ctrl-D

% print typescript

% cif2ca multip8c4.cif
(Convert cif to caesar format. Benign warnings are issued when
user extension 0 lines are encountered.)

% mv project.ca multip8c4.ca
(Give the top level caesar file a suitable name.)

% caesar multip8c4
(Use caesar to affix labels to each bonding pad, then output a new
cif file using :cif -p cmul8c4. See the "caesar" section of
Appendix C for details. Quit caesar.)

% mextra cmul8c4
(Obtain a node extraction.)

% vi multip8c.macro1
(Testing sequence file(s) is/are identical to the 5 micron case.)

% script

% esim cmul8c4.sim multip8c.macro1
(Perform event simulation of chip.)

% ctrl-D

% print typescript

% stipple cmul8c4.cif
(Obtain stipple plot on Versatec. There is no need to worry about
user extension 0 if the cif file was created by caesar.)
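
The two paths can be summarized as a short sketch that lists the
non-interactive commands for a given design name and feature size.
This is only an illustration: the function name and the caesar_out
parameter are hypothetical, the command names and flags are copied
from the listings above without further verification, and the
editing steps (vi, labelling pads in caesar, script/print) are left
to the designer.

```python
def design_cycle(name, microns, caesar_out=None):
    """List the UNIX commands of one pass through the design cycle."""
    if microns not in (4, 5):
        raise ValueError("only the 4 and 5 micron paths are described")
    cif = f"{name}{microns}"                  # e.g. multip8c5
    cmds = [
        f"macpitts {name} int herald",        # interpreter session
        f"macpitts {name} {microns}u herald", # generate the cif file
        f"mv {name}.cif {cif}.cif",
    ]
    if microns == 5:
        # Stanford tools: design rules check, extraction, simulation.
        cmds += [
            f"cif {cif}.cif -gng",
            f"ell {cif}.co",
            f"drc {cif}.sco",
            f"extract {cif}",
            f"sim {cif}",
            f"esim {cif}.sim {name}.macro1",
            f"stipple {cif}.cif",
        ]
    else:
        # Berkeley tools: caesar writes a new, labelled cif file whose
        # name the designer chooses (assumed default: "c" + cif name).
        out = caesar_out or f"c{cif}"
        cmds += [
            f"cif2ca {cif}.cif",
            f"mv project.ca {cif}.ca",
            f"caesar {cif}",
            f"mextra {out}",
            f"esim {out}.sim {name}.macro1",
            f"stipple {out}.cif",
        ]
    return cmds


five = design_cycle("multip8c", 5)
four = design_cycle("multip8c", 4, caesar_out="cmul8c4")
```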


IV.  MACPITTS PERFORMANCE


A.  LAYOUT ERRORS AND INEFFICIENCIES

1.  Inefficiencies

Appendix E contains photographs of an AED 767 color graphics
terminal screen displaying the MacPitts chip layouts for each of
the six multipliers discussed. The presentations were generated by
the caesar VLSI circuit editor [Ref. 6]. Examination of these
layouts, aided by the zoom-in feature of caesar, prompts several
observations about MacPitts' performance.

In any VLSI circuit layout a primary goal is to cover the available
silicon area as densely as possible with circuitry. Only a
variable, but generally small, fraction of the silicon area within
the bounding box of MacPitts layouts is covered with circuitry.
This is due in part to the rigidity of the target architecture,
which requires the layout of data path organelles in a strictly
linear fashion. The most serious waste of space in the examples
explored, however, is caused by the inability of MacPitts to
install bonding pads on all four sides of the chip. The left side
is never available for this purpose due to certain algorithmic
simplifications made by the authors of MacPitts [Ref. 16: p. 13].
A three-sided arrangement of pads stretches the outline
dimensions, particularly in designs which specify a large number of
external connections. All of the partitioned multiplier algorithms
presented in the previous chapter (multip8a, multip8b, and
multip8c) are in this category.

One may consider the possibility of filling the large void above
the useful circuitry in multip8c4, for example, with another
identical instantiation of the multip8c4 layout, minus the pads,
and thereby produce a complete 8 bit multiplier on one chip. Eight
pads for the hin port could then also be eliminated. The cell
movement and yank/put commands of caesar would make this operation
possible with a minimum of drudgery. But the interconnections
between the 2 instantiations of the multip8c4 modules would still
require tedious manual layout, and would be very subject to human
error. Such hand crafting, minus the interconnection
modifications, was, in fact, attempted. Appendix E contains a
photograph displaying the results of this effort, named multip8c4d
to denote "double." It clearly demonstrates that the synergistic
use of MacPitts with caesar is feasible.

To pursue the manual editing approach very far would be to abandon
the basic concept of silicon compilation as defined from the
outset. Nevertheless, editing is required if one is to obtain
efficient use of silicon resources. The appreciation of silicon
compilers like MacPitts still awaits a future in which to perform
such manual editing is more costly (in custom designs intended only
for small volume production) than the silicon area wasted in a
suboptimal layout. One can predict that that future will arrive,
just as it did when the cost of memory hardware dropped, thus
solving an analogous problem: whether to waste memory but write
clear programs, or conserve memory fully at the cost of monumental
programming effort.

A lack of compactness detracts from more than economy of
production, however. There are penalties in circuit operating
speed as well. A closer look at the details of MacPitts layouts
reveals inefficiencies which directly affect circuit performance.
In general, the length of metal and polysilicon interconnections is
much longer than the minimum an experienced human layout artist
would be expected to produce, even when both are limited to using
right-angle (Manhattan) layout rules. For example, all of the
output data bits generated at the far right side of the data path
must be routed back to the left along the entire length of the data
path, then up (or down), over to the right again for the entire
length of the data path, and finally down (or up) again to reach
the bonding pads. In the multip8c4 layout, MacPitts uses wire runs
of up to 18 mm to route data bits from their sources to their
bonding pads which, in some cases, are less than 1 mm direct
distance from the source. The problem lies in the inability of
MacPitts to jump over the metal power/ground bus frame in making
connections from the data path to bonding pads. This


Figure 4.1  Data Path Output

assigning output ports only to the lowest and highest numbered
pins.

MacPitts, therefore, requires that the user provide a functional
specification which is enlightened by knowledge of the layout
limitations if optimum performance is to be obtained. This is an
area for improvement in pursuit of the silicon compiler ideal.

Another layout problem is more difficult to deal with: the
excessive length of wiring between the control unit and the data
path. This could be improved by centering the control unit under
the data path, which would require changing the MacPitts source
code in some undetermined way. As currently written, MacPitts
always begins the control unit at the left margin.

There are also many instances of dead-ended wires in MacPitts
layouts. These "roads to nowhere" occur when MacPitts extends runs
beyond the last point of interconnection. They occur most
frequently on the organelles, not all of whose capabilities may be
used by the behavioral specification in a given instance. This
appears to be a result of an attempt to use the same organelle for
as many different applications as possible, apparently to control
the size of the library. Untrimmed wires of this variety certainly
add to inter-node capacitance, although not to the extent that
inefficient routing does. Nevertheless, they surely reduce the
operating speed of the circuit, and make operation noisier and
perhaps less reliable at high frequencies.

2.  Errors

In addition to the layout inefficiencies described, there is
another problem with MacPitts layouts. At least one input file has
been known to produce a layout containing a fatal error. Kelly
[Ref. 19] attempted to use MacPitts to produce a butterfly
switching element chip. His design (called kchip2) has a much
simpler data path than the multip8c pipeline multiplier, but it has
a larger control unit. It also includes some finite state machine
sequencer units which serve the independent processes he uses in
the design. These are laid out to the right of the data path. The
MacPitts designed layout of this circuit places a direct short
circuit across the 3 clock bus lines. A picture of the portion of
the chip where the error occurs is included in Appendix E. The
problem arises because the clock bus contains "vias" where it must
be extended from the data path to horizontally adjacent elements in
the design. These "vias" allow the metal bus lines to cross
vertical metal frame power or ground lines via a brief transition
to the polysilicon layer, then back to the metal layer. MacPitts,
however, apparently does not check for the presence of any
intersecting vertical polysilicon runs to the control unit which
may be placed at the same horizontal coordinate as the clock bus
vias. None of the multip8 series of designs has any control lines
entering the extreme right end of the data path. Therefore, the
vias are safe, and the problem does not occur. It is interesting
to note, however, that MacPitts still extends the clock bus well to
the right beyond the point of last use, and includes a dead-end set
of vias to jump over the data path frame, even though there is no
need for that extension in the multip8 family of designs. It may
be concluded from these observations that this problem is latent in
all MacPitts designs, and one would do well to examine the control
unit wiring in the vicinity of the clock bus at the right end of
all frames. Caesar can be used, if necessary, to adjust the local
wiring slightly to route the offending control line away from the
clock vias.
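
The manual inspection recommended above amounts to an interval
test: a vertical polysilicon control run is suspect if its
horizontal coordinate falls within the span of a clock bus via. A
minimal sketch of that test follows. The data structure, the names,
and the coordinates are all hypothetical; neither MacPitts nor
caesar is known to provide such a check.

```python
from typing import NamedTuple


class Via(NamedTuple):
    """Horizontal extent of one clock bus via (hypothetical units)."""
    x_min: int
    x_max: int


def via_conflicts(vias, poly_run_xs):
    """Return (via, x) pairs where a vertical poly run crosses a via."""
    return [
        (via, x)
        for via in vias
        for x in poly_run_xs
        if via.x_min <= x <= via.x_max
    ]


# Illustrative coordinates only, not taken from any real layout.
clock_vias = [Via(1200, 1210), Via(1230, 1240)]
control_runs = [400, 812, 1205]
suspects = via_conflicts(clock_vias, control_runs)
```

Any run reported in this way would be a candidate for rerouting by
hand in caesar.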


B.  ORGANELLES VS. STANDARD CELLS

This section briefly examines some comparative aspects of the
Stanford standard cell approach used by Newkirk and Mathews
[Ref. 20] and the organelles used in MacPitts.

Both standard cells and organelles are laid out as bit slices. It
was hoped that there would be a one-to-one functional
correspondence between at least some of the catalogued standard
cells and the organelles which could form a basis for comparison.
Unfortunately, there is very little functional correspondence, let
alone structural correspondence, between the two. The standard
cells contain only dynamic storage elements, and use a 2 phase
clock. The MacPitts organelles use a 3 phase clock, and the only
memory elements available are static master-slave flip-flop
registers. The standard cells are designed for matched pitch.
That is, they can be directly abutted, in many cases, to form full
length words and arrays. Organelles, on the other hand, generally
require some margin around them for interconnections (called "river
routing") which apparently must be specifically tailored for each
instantiation of the organelle.

It was hoped that at least the MacPitts adder organelle, which is
simply a standard asynchronous full adder made entirely from NOR
gates, could be compared with something from the standard cell
library. The most similar standard cell in the catalogue is an
adder/subtractor [Ref. 20: p. 10], which is based on the OM2
arithmetic logic unit [Ref. 2: pp. 145-181]. This cell is much
more flexible, yet also more specialized, than the MacPitts adder.
It is capable of a full range of boolean operations, not just
addition, as determined by the values on two 4 bit control port
lines which are threaded through the cell. It also differs from
the organelle in that its operation is clocked. Although a
comparison based on size hardly seems meaningful for these two
dissimilar units, it is noted that the organelle measures 250 x 40
lambda units using measurements taken from actual layout plots.
The standard cell adder measures 211 x 32 lambda units as specified
in [Ref. 20: p. 11].

The MacPitts static register organelle has no functional parallel
in the standard cell library for the reasons mentioned above. It
measures 64 x 30 lambda units, excluding the clock buffer unit
which contains a load enable line affecting all the bits in the
same register. The standard cell dynamic shift register bit
measures 88 x 24 lambda units, and contains a selector input line
for each bit of the register built from these cells.
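
For reference, the dimensions quoted above translate into the
following bit-slice areas. This is plain arithmetic on the figures
in the text; as already noted, the units compared are functionally
dissimilar, so the ratios are indicative only.

```python
# Cell dimensions in lambda units, as quoted in the text.
cells = {
    "MacPitts adder organelle": (250, 40),
    "standard cell adder/subtractor": (211, 32),
    "MacPitts static register organelle": (64, 30),
    "standard cell dynamic shift register bit": (88, 24),
}

# Area of each cell in square lambda.
areas = {name: w * h for name, (w, h) in cells.items()}

adder_ratio = (areas["MacPitts adder organelle"]
               / areas["standard cell adder/subtractor"])
register_ratio = (areas["MacPitts static register organelle"]
                  / areas["standard cell dynamic shift register bit"])

for name, area in areas.items():
    print(f"{name}: {area} square lambda")
print(f"adder organelle / standard cell: {adder_ratio:.2f}")
print(f"register organelle / standard cell: {register_ratio:.2f}")
```

On these figures the adder organelle occupies roughly one and a half
times the area of the standard cell, while the register organelle
(without its clock buffer) is slightly smaller than the shift
register bit.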

C.  SOFTWARE INCOMPATIBILITIES

The authors of MacPitts have extended the CIF language to make "0"
at the beginning of a line indicate that the rest of the line
contains the coordinates of a node, the mask layer to which it
applies, and a label name for that node. This is a useful feature
with the Stanford node extraction programs, which recognize this
label device and use it automatically to make the node accessible
to simulation programs simply by calling its name. This extension
of CIF is unknown to the Berkeley VLSI tools. The latter use
another CIF extension, "94", to flag node labels.
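
In principle the incompatibility could be bridged by rewriting each
"0" label line as a "94" line before the file reaches the Berkeley
tools. The sketch below illustrates the idea only. The field order
assumed for the "0" extension (x, y, layer, label) follows the
description above but has not been checked against MacPitts output,
and the "94 label x y layer;" form is likewise an assumption about
the Berkeley convention. The function name is hypothetical.

```python
def user0_to_94(line):
    """Rewrite one assumed '0 x y layer label;' line as a '94' line.

    Lines that do not match the assumed shape are returned unchanged.
    """
    fields = line.strip().rstrip(";").split()
    if len(fields) != 5 or fields[0] != "0":
        return line
    _, x, y, layer, label = fields
    return f"94 {label} {x} {y} {layer};"


def convert_cif(text):
    """Apply the rewrite to every line of a CIF file held in a string."""
    return "\n".join(user0_to_94(line) for line in text.splitlines())


# Hypothetical sample; the coordinates and names are invented.
sample = "0 1200 3400 NM phia;\nL NM;\nB 40 40 100 100;"
converted = convert_cif(sample)
```

The procedure actually followed in this work was different: labels
were affixed by hand in caesar for the 4 micron path, and the "0"
lines were commented out before plotting in the 5 micron path.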


V.  CONCLUSION


A.  SUMMARY

This thesis has described silicon compilers, and demonstrated how
the MacPitts silicon compiler can be employed to design a digital
pipelined multiplier using a partitioning concept.

Shortcomings of this silicon compiler have been found which make
the results produced by it inferior in some ways to those produced
by practiced designers. These shortcomings may be outweighed, for
some applications, by the reduction in design time. The functional
correctness of the MacPitts multiplier design has been demonstrated
to the extent allowed by available simulation tools. Other
MacPitts designs may contain errors which can be edited out with
relative ease.

The user of MacPitts can affect the output of the compilation
process in two meaningful ways. First, it may be possible to write
the behavioral specification algorithm to allow partitioning of the
design among more than one chip. This possibility should be
explored when layout size is a problem. Second, proper assignment
of pins can reduce the worst-case length of pin pad wiring.

MacPitts has been found compatible, except in a few cases, with
other VLSI design tools at NPS. The caesar VLSI editor has been
particularly useful, along with the cifplot stipple plotter, in
gaining insight into the processes employed by MacPitts in
producing a layout.

Although the final multiplier design was submitted for fabrication,
unexpected delays in production schedules precluded testing the
finished product as part of this research.


B.  RECOMMENDATIONS


The following recommendations should be considered:

1.  Test the multiplier chips, when they become available, using
the event simulation macros and as many other input combinations as
facilities allow. Single-cycle testing should be done before
dynamic testing is undertaken using a direct memory access tester.

2.  Dissect MacPitts designs with caesar, saving in separate cif
files useful symbols to add to the local VLSI library. Symbols
such as pad frames or entire data path units may be of interest.

3.  Write new organelles for the MacPitts library. A
carry-look-ahead adder would be a useful addition.

4.  Enlarge the capabilities of MacPitts to produce designs in a
CMOS technology. This would involve not only writing new data path
organelles, but modifying the control unit architecture, as well.

5.  Obtain a capability locally to handle file transfers over the
ARPANET/MILNET system.


APPENDIX A

INSTALLATION OF MACPITTS ON VAX-11/780 UNDER UNIX 4.1 AND 4.2


A.  INSTALLATION UNDER UNIX 4.1 OPERATING SYSTEM

MacPitts is distributed as a collection of discrete source code
files written in the "C" programming language and in Franz Lisp
Opus 38. Also included in this distribution are two library files
containing the bonding pad layouts in CIF, and a library file
containing the standard organelles. The complete list of files is
given in Table II. These files are located in the directory
/vlsi/macpit under ownership of vlsi.

All of the operations necessary to build macpitts are sequenced by
the "Makefile," a feature of the UNIX operating system that directs
the automatic compilation and assembly of source programs to
produce large software modules.

Building an executable version of the macpitts program requires
that each source file be first compiled by the "liszt" lisp
compiler or the "cc" compiler, as appropriate. The pads.l file is
a lisp source which is actually generated by another lisp source.
The latter source, padgen.l, filters the bonding pad CIF
information contained in the rinout and pads20 files, and produces
pads.l, a list of bonding pad information in the standard syntax of
Franz Lisp. Pads.l is then "liszt'ed" (compiled) to produce the
pads.o object file. The next step of the process fast-loads all of
the compiled object files, linking them together in a single lisp
"environment." Finally, the default settings for all the macpitts
options invoked at run time are overlaid. It is this linked lisp
environment, with the




TABLE II

MacPitts Source Files


Makefile      - a Makefile used to build the complete MacPitts
                system
l5.l          - layout language used by macpitts to generate CIF

              - next 13 files are the lisp source code for
                MacPitts
control.l
data-path.l   - has built-in organelles
defstructs.l
extract.l
flags.l
frame.l       - layout of obj file starts here
front-page.l
general.l
interpret.l   - interactive interpreter
order.l
pads.l        - created during "make macpitts"
prepass.l     - execution starts here

padgen.l      - makes pads.l from next 2 files
rinout        - Stanford Cell Library pads
pad20b        - MOSIS 2.0 micron pads

library       - standard macro, function, test, and organelle
                library
organelles.l  - compiled portion of organelle library
lincoln.l     - the Lincoln Laboratory lisp environment
c-routines.c  - interfaces to operating system
macpitts      - dumped MacPitts environment


defaults set, which is finally dumped as the binary executable
module: macpitts. To repeat: this entire process is performed
automatically by the Makefile.

Because this dumped lisp environment embodies all the built in
functions of Franz Lisp, as well as the functions of macpitts, it
contains a very large number of lisp functions. To accommodate all
these functions, Franz Lisp must be remade with new values for the
parameters MAXFNS and TRENTS, which set the maximum number of
functions and function table entries allowable. Also, the padgen.l
file uses the "untyi" function of the Franz Lisp Opus 38 fast
loader, which permits insertion of a single character in the input
buffer string. The "untyi" function is not a part of the Franz
Lisp Opus 36 source supplied with UNIX 4.1. Therefore, when Franz
Lisp is remade with the new MAXFNS and TRENTS values, the "untyi"
function must be added to the fast loader source code. The steps
to accomplish a remake of Franz Lisp are as follows:

•  In the file /usr/src/cad/lisp/franz/sysat.c add the following
   line to the group of MK declarations:

       MK("untyi", Luntyi, lambda);

•  In the file /usr/src/cad/lisp/franz/h/lfuncs.h add the following
   line to the group of lispval declarations:

       lispval Luntyi();

•  In the file /usr/src/cad/lisp/franz/lam6.c append the following
   code segment:

       lispval
       Luntyi()
       {
           lispval port, ch;

           port = nil;
           switch (np - lbot) {
           case 2: port = lbot[1].val;    /* falls through to case 1 */
           case 1: ch = lbot[0].val;
                   break;
           default:
                   argerr("untyi");
           }

           if (TYPE(ch) != INT) {
               errorh(Vermisc, "untyi: expects fixnum character",
                      nil, FALSE, 0, ch);
           }

           ungetc((int) ch->i,
                  okport(port, okport(Vpiport->a.clb, stdin)));
           return (ch);
       }

•  In the file /usr/src/cad/lisp/franz/nfasl.c change the value of
   MAXFNS to 10000.

•  In the file /usr/src/cad/lisp/franz/h/structs.h change the value
   of TRENTS to 1024.

•  Do a "make all" from the directory /usr/src/cad/lisp.

Franz Lisp is now ready to compile MacPitts. The next step is to
correct and modify the source code for MacPitts itself:

•  In the file /vlsi/macpit/c-routines.c add these lines at the
   beginning:

       #define VPRINT      0100
       #define VPLOT       0200
       #define VPRINTPLOT  0400
       #define VGETSTATE   (('v'<<8)|0)
       #define VSETSTATE   (('v'<<8)|1)

•  In the same file add the following lines after line 188:

       static int plotmd[] = { VPLOT, 0, 0 };
       static int prtmd[] = { VPRINT, 0, 0 };

•  In the same file change line 199 to:

       ioctl(plotter, VSETSTATE, plotmd);

•  In the same file change line 207 to:

       ioctl(plotter, VSETSTATE, prtmd);

•  In the file /vlsi/macpit/Makefile change line 5 to:

       MacPitts = /vlsi/macpit/bin/macpitts

•  In the same file change line 83 to:

       (load interpret.l)\

•  In the same file change line 84 to:

       (setq macpitts-directory '/vlsi/macpit)\

•  In the same file change line 87 to:

       (setq option-list '(opt-d opt-c stat obj cif nologo))\

•  In the same file change line 94 to:

       mv macpitts $(MacPitts)

•  In the file /vlsi/macpit/interpret.l change line 18 to:

       (setq library (get-library))

•  In the file /vlsi/macpit/lincoln.l change line 1093 to:

       (cfasl '|/vlsi/macpit/c-routines.o -lcurses -ltermcap|

After making these changes, macpitts is ready to "make." Type
"make macpitts." All the files will be compiled, linked, loaded,
and then dumped as a complete macpitts lisp environment. This
takes about 45 minutes on a lightly loaded system. Next type "make
install." This command simply moves the dumped executable module
into the directory /vlsi/macpit/bin. Now type "make clean" to
remove all the lisp object files that are no longer needed. The
size of the macpitts executable module is 1384704 bytes. Finally,
any user of macpitts should add the directory /vlsi/macpit/bin to
the path list in the .login file in his home directory.

B.  INSTALLATION UNDER UNIX 4.2 OPERATING SYSTEM

The macpitts generated on a UNIX 4.1 system will not run under UNIX
4.2. This is because the system calls are different. The version
of Franz Lisp supplied with UNIX 4.2 is Opus 38, which already
includes the "untyi" function. Therefore it is not necessary to
modify the sysat.c, lfuncs.h, or lam6.c files. It is necessary,
however, to increase the MAXFNS and TRENTS values just as in the
case of a UNIX 4.1 installation. For 4.2 these parameters are
found in the files /usr/src/ucb/lisp/franz/fasl.c and
/usr/src/ucb/lisp/franz/h/structs.h, respectively. After making
these two changes, change directories to /usr/src/ucb/lisp, enter
super-user, and issue the command "lispconf." This starts up an
interactive program which allows you to specify the type of machine
on which Franz Lisp is being installed. The answers to the
questions posed by this script will be obvious if you are using a
VAX computer running UNIX 4.2. Next issue "make fast" from the
same directory and the lisp system will be generated. This step
takes about 2 hours on a lightly loaded machine. After this is
done, issue "make install" to move the files into the standard
system directories.

The 4.2 operating system also contains another bug that will
prevent the macpitts interpreter from running. In the file
/usr/src/usr.lib/libterm/tputs.c change OSPEED to TOSPEED
everywhere it occurs. Then recompile tputs.c. This is to avoid
multiple definition of OSPEED in this file and in another file,
/usr/src/usr.lib/libcurses/cr_tty.c.
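
The rename can be done by hand in an editor. For illustration only,
the sketch below shows the same whole-word substitution applied to
source text held in a string; the function name is hypothetical and
the sample text is invented, not copied from tputs.c. Matching on
word boundaries leaves longer identifiers that merely contain the
name untouched, and makes a second application harmless.

```python
import re


def rename_symbol(source, old="OSPEED", new="TOSPEED"):
    """Replace whole-word occurrences of one identifier with another."""
    return re.sub(rf"\b{re.escape(old)}\b", new, source)


# Invented sample text standing in for the contents of the file.
before = "short OSPEED;\nmspc10 = tmspc10[OSPEED];\nint MY_OSPEED;"
after = rename_symbol(before)
```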

The modifications to the MacPitts source code itself are the same
as those required for a UNIX 4.1 installation, with the following
exception and addition:

•  In the file /vlsi/macpit/Makefile it is not necessary to change
   line 83. This line should remain:

       (fasl 'interpret)

•  Opus 38 of Franz Lisp, unlike Opus 36, complains if parameters
   declared in a functional definition are not used in the
   definition itself. The MacPitts source code contains an
   instance of this malpractice. Therefore, in the file
   /vlsi/macpit/frame.l change line 1338 to:


The process of "make macpitts" is done the same as for UNIX 4.1,
but the results are somewhat different. Franz Lisp issues warnings
during compilation whenever an expression is encountered which does
not have the proper number of parameters immediately available.
These warnings occur frequently when macpitts is made under UNIX
4.2. This happens because the macpitts source code is contained in
many separate files, each of which may have external references
that remain unresolved until the object modules are all loaded and
linked together. These warnings have no effect on the quality of
macpitts produced, but their delivery does consume cpu time. As a
result, it takes approximately 90 minutes to "make macpitts" under
UNIX 4.2. The final MacPitts executable is 1567888 bytes long in
Opus 38 on 4.2. Finally, remember to add the /vlsi/macpit/bin
directory to the path list in the .login file in your home
directory.


APPENDIX B

INSTALLATION OF THE CAESAR VLSI EDITOR UNDER UNIX 4.1 AND 4.2


A.  INSTALLATION UNDER UNIX 4.1

The caesar VLSI circuit editor is one of many programs contained in
the distribution of 1983 VLSI C.A.D. tools from U.C. Berkeley. The
distribution tape is loaded, in its entirety, in the directory
/vlsi/berk83 under ownership of vlsi. Before installing the tools,
perform the following:

1. Have the system programmers create a new user,
"sleeper," with password "caesar," and home directory
/vlsi/berk83/bin. Create a ".login" file in
/vlsi/berk83/bin which consists of only the following two
lines:

sleeper 

logout 

This step allows the use of a graphics tablet to position
the cursor in caesar, an important facility.

2. Have the system programmer create another new user,
"cad" with the password close-held, and home directory
/vlsi/berk83. This step resolves the many references
to "~cad" which are scattered throughout the distribution
tape.

3. In the file /vlsi/berk83/man/tmac.anc replace every
occurrence of the string ~cad with the string
/vlsi/berk83.

4. Edit the file /vlsi/berk83/lib/displays to contain
only the following one line:

/dev/tty22 /dev/tty20 std AED767

5. Edit the file /vlsi/berk83/src/caesar/config.c to
replace every occurrence of the string ~cad with the
string /vlsi/berk83.

6. In the file /vlsi/berk83/src/caesar/main.c find the
single "return" statement in the procedure
"OnCommand." Just before that statement, add a line
containing the statement "GrFlush();".

7. In the file /vlsi/berk83/src/makewhatis.csh remove the
string "man4" from line 8.
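
Steps 3 and 5 are plain string substitutions applied to a whole file. A minimal Python sketch of that idea follows, assuming the file contents have already been read into a string. The function name `retarget_cad_paths` and the sample line are illustrative only and are not taken from the distribution.

```python
def retarget_cad_paths(text, new_root="/vlsi/berk83"):
    """Replace every occurrence of the literal string ~cad with new_root."""
    return text.replace("~cad", new_root)


# Hypothetical sample line; the real files are tmac.anc and config.c.
sample = 'lib = "~cad/lib/displays"\n'
print(retarget_cad_paths(sample), end="")
```

Reading each file, applying the function, and writing the result back would reproduce the manual edit; keeping a copy of the original file first is a sensible precaution.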

Now proceed with the installation by issuing the
following commands. Allow each command to run to completion
before issuing the next. Completion is indicated by the
return of the system prompt, "%."

cd /vlsi/berk83/src/caesar
make
mv caesar /vlsi/berk83/bin/caesar
rm *.o
cd ..
src/makewhatis.csh

This completes the installation of caesar, mextra, cadman,
and cif2ca. There are other programs on this distribution
for which the foregoing procedure should have also been
sufficient to achieve a satisfactory installation, but these
remain untested.

Finally, any user of these tools should add the directory
/vlsi/berk83/bin to the path list in the .login file of
his home directory.

B. INSTALLATION UNDER THE UNIX 4.2 OPERATING SYSTEM

The Unix 4.2 operating system uses timing and interrupt
calls which differ significantly from those used by Unix
4.1. Therefore, because caesar makes extensive use of these
calls, the tool as installed for 4.1 will not run under 4.2.
A different distribution tape has been written for the
Berkeley 1983 design tools under UNIX 4.2. Installation of
this distribution proceeds in the same way as the 4.1
distribution except that step 6 is unnecessary. The bug
that this step corrects has already been corrected on the
4.2 distribution tape.

It is also necessary to change a line which occurs in
five files in the directory /vlsi/berk83/src/caesar
from #include <time.h>
to #include <sys/time.h>

The five files affected are main.c, aed4.c, omega4.c,
ramtek4.c and vect4.c.
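
The header change is the same one-line edit in each of the five files. As a hedged illustration, the Python sketch below rewrites the include line in a source string; the function name `patch_time_include` is an assumption made for this example, and the regular expression tolerates optional spaces that may or may not occur in the real sources.

```python
import re

# Matches '#include <time.h>' at the start of a line, allowing optional spaces.
_TIME_INCLUDE = re.compile(r"^(\s*#\s*include\s*)<time\.h>", re.MULTILINE)


def patch_time_include(source):
    """Return source with <time.h> includes changed to <sys/time.h>."""
    return _TIME_INCLUDE.sub(r"\1<sys/time.h>", source)


print(patch_time_include("#include <stdio.h>\n#include <time.h>\n"), end="")
```

A line that already names <sys/time.h> does not match the pattern, so applying the function twice leaves the file unchanged.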

Now proceed with the installation by issuing the
following commands:

cd /vlsi/berk83/src/caesar
make
mv caesar /vlsi/berk83/bin/caesar
rm *.o
cd ..
src/makewhatis.csh

Finally, add the directory /vlsi/berk83/bin to the path
list in the .login file in your home directory.


MANUAL PAGES FOR BERKELEY DESIGN TOOLS

An online operator's manual exists for all of the VLSI
design tools in the 1983 distribution from Berkeley.
Information on the use of any of these can be made to appear
on the terminal screen by issuing

cadman <program>

where <program> can be cadman, caesar, cif2ca, cifplot,
esim, mextra, or any of the other programs in that distribution.
Only those pages affecting tools used in this silicon
compiler research are reproduced in this appendix.

Note that the cadman program is contained in the directory

/vlsi/berk83/bin

Therefore either include this directory in the search path
of your ".login" file or invoke cadman by the full rooted
command:

/vlsi/berk83/bin/cadman <program>

NAME

cadman  -  run  off  section  of  UNIX  manual 
SYNOPSIS 

cadman  [  -  ]  [  -t  ]  [  section  ]  title  ... 

DESCRIPTION 

Cadman is a program which prints sections of the cad manual.
Section is an optional arabic section number, i.e. 3, which
may be followed by a single letter classifier, i.e. 1m
indicating a maintenance type program in section 1. It may also
be "cad", "new", "junk", or "public". If a section
specifier is given cadman looks in that section of the
cad manual for the given titles. If section is omitted, cadman
searches all sections of the cad manual, giving preference
to commands over subroutines in system libraries, and
printing the first section it finds, if any.

If the standard output is a teletype, or if the flag - is
given, then cadman pipes its output through ssp(1) to crush
out useless blank lines, ul(1) to create proper underlines
for different terminals, and through more(1) to stop after
each page on the screen. Hit a carriage return to continue,
a control-D to scroll 12 more lines when the output stops.

The -t flag causes cadman to arrange for the specified section
to be troff'ed to the Versatec.


FILES

~cad/doc/cadman/man?/*


SEE ALSO

Programmer's manual: more(1), ul(1), ssp(1), man(1), apropos(1)


BUGS 

The manual is supposed to be reproducible either on the
phototypesetter or on a typewriter. However, on a typewriter
some information is necessarily lost.

NAME

caesar  -  VLSI  circuit  editor 
SYNOPSIS 

caesar [ -n -g graphics_port -t tablet_port -p path -m
monitor_type -d display_type ] [ file ]

DESCRIPTION 

Caesar is an interactive system for editing VLSI circuits at
the level of mask geometries. It uses a variety of color
displays with a bit pad as well as a standard text terminal.
For a complete description and tutorial introduction, see
the user manual "Editing VLSI Circuits with Caesar" (an
on-line copy is in ~cad/doc/caesar.tblms).

Command  line  switches  are: 

-n  Execute  in  non-interactive  mode. 

-g The next argument is the name of the port to use for
communication with the graphics display. If not specified,
Caesar makes an educated guess based on the terminal
from which it is being run.

-t The next argument is the name of the port to use for
reading information from the graphics tablet. If not
specified, Caesar makes an educated guess (usually the
graphics port).

-p  The  next  argument  is  a  search  path  to  be  used  when 
opening  files. 

-m The next argument is the type of color monitor being
used, and is used to select the right color map for the
monitor's phosphors. "std" works well for most monitors;
"pale" is for monitors with especially pale blue
phosphor.

-d The next argument is the type of display controller
being used. Among the display types currently understood
are: AED512, UCB512 (the AED512 with special
Berkeley PROMs for stippling), AED767, AED640 (an
AED767 configured as 483x640 pixels), Omega440, R9400,
or Vectrix.

When  Caesar  starts  up  it  looks  for  a  command  file  with  the 
name  ".caesar"  in  the  home  directory  and  processes  it  if  it 
exists.  Then  Caesar  looks  for  a  .caesar  file  in  the  current 
directory  and  reads  it  as  a  command  file  if  it  exists.  The 
.caesar  file  format  is  described  under  the  long  command 
source. 
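
The start-up lookup order described above can be summarized in a short Python sketch. It is an illustration only: the function name `startup_command_files` and the injectable `exists` predicate are assumptions made for this example, and the sketch models only the order in which the two ".caesar" files are considered, not how Caesar reads them.

```python
import posixpath


def startup_command_files(home_dir, current_dir, exists):
    """Return the .caesar files to process, home directory first.

    exists is a predicate such as os.path.isfile; it is passed in so the
    lookup order can be checked without touching the file system.
    """
    candidates = [
        posixpath.join(home_dir, ".caesar"),
        posixpath.join(current_dir, ".caesar"),
    ]
    return [path for path in candidates if exists(path)]


# Example: only the home directory has a .caesar file.
print(startup_command_files("/u/vlsi", "/u/vlsi/mult8",
                            lambda p: p == "/u/vlsi/.caesar"))
```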

You generally have to log in on the color terminal under the
name "sleeper" (password "caesar"). This is necessary in
order for the tablet to be useable. Sleeper can be killed
by typing two control-backslashes in quick succession on the
color display keyboard (on the AED displays, control-backslash
is gotten by typing control-shift-L.)


The  four  buttons  on  the  graphics  tablet  puck  are  used  in  the 
following  way: 

left  (white)  (#2) 

Move  the  box  so  that  its  fixed  corner  (normally  lower- 
left)  coincides  with  the  crosshair  position. 

right  (green)  (#4) 

Move  the  box's  variable  corner  (normally  upper-right) 
to  coincide  with  the  crosshair  position.  The  fixed 
corner  is  not  moved. 

top  (yellow)  (#1) 

Find  the  cell  containing  the  crosshair  whose  lower-left 
corner  is  closest  to  the  crosshair.  Make  that  cell  the 
current  cell.  If  the  button  is  depressed  again  without 
moving  the  crosshair,  the  parent  of  the  current  cell  is 
made  the  current  cell. 

bottom  (blue)  (#3) 

Paint  the  area  of  the  box  with  the  mask  layers  under¬ 
neath  the  crosshair.  If  there  are  no  mask  layers  visi¬ 
ble  underneath  the  crosshair,  erase  the  area  of  the 
box. 


SHORT  COMMANDS 

Short commands are invoked by typing a single letter on the
keyboard. Valid commands are:


a  Yank  the  information  underneath  the  box  into  the  yank 
buffer.  Only  yank  the  mask  layers  present  under  the 
crosshair  (if  there  are  no  mask  layers  underneath  the 
crosshair,  yank  all  mask  layers  and  labels). 

c  Unexpand  current  cell  (display  in  bounding  box  form). 

d Delete paint underneath the box in the mask layers
underneath the crosshair (if there are no mask layers
underneath the crosshair, then delete labels and all
mask layers).

e Move the box up 1 lambda.

g Toggle grid on/off.

l Redisplay the information on both text and graphics
screens.

q  Move  the  box  left  1  lambda. 

r  Move  the  box  down  1  lambda. 

s  Put  back  (stuff)  all  the  information  in  the  yank  buffer 
at  the  current  box  location.  Stuff  only  information  in 
mask  layers  that  are  present  underneath  the  crosshair 
(if  there  are  no  mask  layers  underneath  the  crosshair, 
stuff  all  mask  layers  plus  labels) . 

u  Undo  the  last  change  to  the  layout. 

w  Move  the  box  right  one  lambda. 

x  Unexpand  all  cells  that  intersect  the  box  but  don't 
contain  it. 

z  Zoom  in  so  that  the  area  underneath  the  box  fills  the 
screen. 

C  Expand  current  cell  so  that  its  paint  and  children  can 
be  seen. 

X Expand all cells that intersect the box, recursively,
until there are no unexpanded cells intersecting the
box.

Z  Zoom  out  so  that  everything  on  current  screen  fills  the 
area  underneath  the  box. 

5  Move  the  picture  so  that  the  fixed  corner  of  the  box  is 
in  the  center  of  the  screen. 

6  Move  the  picture  so  that  the  variable  corner  of  the  box 
is  in  the  center  of  the  screen. 

^L Redisplay the graphics and text displays.

.  Repeat  the  last  long  command. 


LONG  COMMANDS 

Long commands are invoked by typing a colon character (":").
The cursor will appear on the bottom line of the text terminal.
A line containing a command name and parameters should
be typed, terminated by return. Each line may consist of
multiple commands separated by semi-colons (to use a colon
as part of a long command, precede it with a backslash).
Short commands may be invoked in long command format by
preceding the short command letter with a single quote.
Unambiguous abbreviations for command names and parameters
are accepted. The commands are:

align  <scale> 

Change  crosshair  alignment  to  <scale>.  Crosshair  posi¬ 
tion  will  be  rounded  off  to  nearest  multiple  of 
<scale>. 

array  <xsize>  <ysize> 

Make  the  current  cell  into  an  array  with  <xsize> 
instances  in  the  x-direction  and  <ysize>  instances  in 
the  y-direction.  The  spacing  between  elements  is 
determined  by  the  box  x-  and  y-dimensions. 

array  <xbot>  <ybot>  <xtop>  <ytop> 

Make  the  current  cell  into  an  array,  numbered  from 
<xbot>  to  <xtop>  in  the  x-direction  and  from  <ybot>  to 
<ytop>  in  the  y-direction.  The  spacing  between  array 
elements  is  determined  by  the  box  x-  and  y-dimensions. 
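
As a rough illustration of how the box dimensions set the array pitch, the Python sketch below lists where each element of an array would sit. It assumes that element (i, j) is displaced by i box widths and j box heights from the first element; that placement rule, the function name `array_element_origins`, and the `origin` parameter are assumptions for this example rather than statements from the manual page.

```python
def array_element_origins(xsize, ysize, box_width, box_height, origin=(0, 0)):
    """Lower-left corners of array elements, row by row (assumed layout)."""
    ox, oy = origin
    return [(ox + i * box_width, oy + j * box_height)
            for j in range(ysize)
            for i in range(xsize)]


# A 3 x 2 array with a box 10 lambda wide and 4 lambda high.
for corner in array_element_origins(3, 2, 10, 4):
    print(corner)
```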

box  <keyword>  <amount> 

Change  the  box  by  <amount>  lambda  units,  according  to 
<keyword>.  If  <keyword>  is  one  of  "left",  "right", 
"up",  or  "down",  the  whole  box  is  moved  the  indicated 
amount  in  the  indicated  direction.  If  <keyword>  is  one 
of  "xbot",  "ybot",  "xtop",  or  "ytop",  then  one  of  the 
coordinates  of  the  box  is  adjusted  by  the  given  amount. 
<amount>  may  be  either  positive  or  negative. 

button  <number>  <x>  <y> 

Simulate  the  pressing  of  button  <number>  at  the  screen 
location  given  by  <x>  and  <y>  (in  pixels).  If  <x>  and 
<y>  are  omitted,  the  current  crosshair  position  is 
used. 

cif -sblpx <name> <scale>

Write out a CIF description of the layout into file
<name> (use edit cell name by default; a ".cif" extension
is supplied by default). <scale> indicates how
many centimicrons to use per Caesar unit (200 by
default). The -s switch causes no silicon (paint) to
be output to the CIF file. The -b switch causes bounding
boxes to be drawn for unexpanded cells. The -l switch
causes labels to be output. The -p switch causes a CIF
point to be generated for each label. The -x switch
causes Caesar not to automatically expand all cells
(they are expanded by default).
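
The <scale> argument is a unit conversion: centimicrons per Caesar unit, with 100 centimicrons to the micron. A minimal Python sketch of the arithmetic, with the function name `caesar_units_to_microns` chosen for this example only:

```python
def caesar_units_to_microns(units, scale=200):
    """Convert Caesar units to microns.

    scale is the number of centimicrons per Caesar unit (200 by default,
    as in the cif command); 100 centimicrons make one micron.
    """
    return units * scale / 100.0


# With the default scale, a 3-unit-wide wire is 6 microns wide.
print(caesar_units_to_microns(3))
```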

cload <file>

Load the colormap from <file>. The monitor type is
used as default extension.

clockwise  <degrees>  [y] 

Rotate  the  current  cell  by  the  largest  multiple  of  90 
degrees  less  than  or  equal  to  <degrees>.  <degrees> 
defaults  to  90.  If  the  command  is  followed  by  a  "y" 
then  the  yank  buffer  is  rotated  instead  of  the  current 
cell . 
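
The rounding rule for <degrees> amounts to integer division by 90. A small Python sketch, assuming a non-negative integer argument (the manual page does not say how negative values are treated) and using the illustrative name `effective_rotation`:

```python
def effective_rotation(degrees=90):
    """Largest multiple of 90 that is less than or equal to degrees."""
    return (degrees // 90) * 90


for requested in (90, 200, 89):
    print(requested, "->", effective_rotation(requested))
```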

colormap  <layers> 

Print  out  the  red,  green,  and  blue  intensities  associ¬ 
ated  with  <layers>. 

colormap  <layers>  <red>  <green>  <blue> 

Set  the  intensities  associated  with  <layers>  to  the 
given  values. 

copycell 

Make  a  copy  of  the  current  cell,  and  position  it  so 
that  its  lower-left  corner  coincides  with  the  lower- 
left  corner  of  the  box. 

csave  <file> 

Save  the  current  colormap  in  <file>  (the  monitor  type 
is  used  as  default  extension) . 

deletecell

Delete the current cell.

editcell <file>

Edit the cell hierarchy rooted at <file>. A ".ca"
extension is supplied by default. If information in
the current hierarchy has changed, you are given a
chance to write it out.

erasepaint <layers>

For the area enclosed by the box, erase all paint in
<layers>. If <layers> is omitted it defaults to "*l".

fill <direction> <layers>

<direction> is one of "left", "right", "up", or "down".
The paint under one edge of the box (respectively, the
right, left, bottom, or top edge) is sampled; everywhere
that the edge touches paint, the paint is
extended in the given direction to the opposite side of
the box. <layers> selects which layers to fill; if
omitted then a default of "*" is used.

flushcell

Remove the definition of the current cell from
main memory and reload it from the disk version. Any
changes to the cell since it was last written are lost.

getcell <file>

This command makes an instance of the cell in <file> (a
".ca" extension is supplied by default) and positions
that instance at the current box location. The box
size is changed to equal the bounding box of the cell.

gridspacing 

The  grid  is  modified  so  that  its  spacings  in  x  and  y 
equal  the  dimensions  of  the  box.  The  grid  is  set  so 
that  the  box  falls  on  grid  points. 


gripe 

The  mail  program  is  run  so  that  comments  can  be  sent  to 
the  Caesar  maintainer. 

height  <size> 

The  box's  height  is  set  to  <size>.  If  <size>  is  pre¬ 
ceded  by  a  plus  sign  then  the  fixed  corner  is  moved  to 
set  the  correct  height;  otherwise  the  variable  corner 
is  moved.  <size>  defaults  to  2. 

identifycell  <name> 

The  current  cell  is  tagged  with  the  instance  name  given 
by  <name>.  This  feature  is  not  currently  supported  in 
any  useful  fashion.  <name>  may  not  contain  any  white 
space. 

label  <name>  <position> 

A  rectangular  label  is  placed  at  the  box  location  and 
tagged  with  <name>.  <name>  may  not  contain  any  white 
space.  <position>  is  one  of  "center",  "left",  "right", 
"top",  or  "bottom";  it  specifies  where  the  text  is  to 
be  displayed  relative  to  the  rectangle.  If  omitted, 
<position>  defaults  to  "top”. 

lyra  <ruleset> 

The  program  ~cad/bin/lyra  is  run,  and  is  passed  via 
pipe  all  the  mask  features  within  3L  of  the  box.  The 
program  returns  labels  identifying  design  rule  viola¬ 
tions,  and  these  are  added  to  the  edit  cell.  If 
<ruleset>  is  specified,  it  is  passed  to  Lyra  with  the 
-r  switch  to  indicate  a  specific  ruleset.  Otherwise, 
the  current  technology  is  used  as  the  ruleset. 

macro <character> <command>

The given long command is associated with the given
character, such that whenever the character is typed as
a short command then the given command is executed.
This overrides any existing definition for the character.
To clear a macro definition, type ":macro
<character>", and to clear all macro definitions, type
":macro".

mark <mark1> <mark2>

The box is saved in the mark given by <mark1>. <mark1>
must be a lower-case letter. If <mark2> is specified,
the box is changed to coincide with <mark2>.

movecell <keyword>

The current cell is moved in one of two ways, selected
by <keyword>. If <keyword> is "byposition", then the
cell is moved so that its lower-left corner coincides
with the lower-left corner of the box. This also happens
if no keyword is specified. If <keyword> is
"bysize", then the cell is displaced by the size of the
box (this means that what used to be at the fixed
corner of the box will now be at the variable corner).

paint <layers>

The area underneath the box is painted in <layers>.

path <path>

The string given by <path> becomes the search path used
during file lookups. <path> consists of directory
names separated by colons or spaces. Each name should
end in "/".
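
A search path in this form can be split into its directory names with a single regular expression. The Python sketch below is illustrative; the function name `split_search_path` and the sample directories are assumptions, not part of Caesar.

```python
import re


def split_search_path(path):
    """Split a path string on colons and/or white space."""
    return [name for name in re.split(r"[:\s]+", path) if name]


# Hypothetical path mixing both separators.
print(split_search_path("./ /vlsi/cells/:/vlsi/berk83/lib/"))
```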

peek  <layers> 

Display  all  paint  underneath  the  box  belonging  to 
<layers>,  even  for  unexpanded  cells  and  their  descen¬ 
dants. 

popbox  <mark> 

If  <mark>  is  specified,  then  the  box  is  replaced  with 
the  given  mark.  Otherwise  the  box  stack  is  popped  and 
the  top  stack  element  overwrites  the  box. 

pushbox  <mark> 

The  box  is  pushed  onto  the  box  stack.  If  <mark>  is 
specified  then  it  is  used  to  overwrite  the  box,  other¬ 
wise  the  box  remains  unchanged. 
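
The interplay of marks and the box stack in pushbox and popbox can be modelled in a few lines of Python. This is a sketch of the behaviour as described in the two entries above, not Caesar's implementation; the class name `BoxState` and the tuple representation of a box are assumptions.

```python
class BoxState:
    """Toy model of the box, the box stack, and named marks."""

    def __init__(self, box):
        self.box = box      # e.g. (xbot, ybot, xtop, ytop)
        self.stack = []     # box stack used by pushbox/popbox
        self.marks = {}     # mark name -> saved box

    def mark(self, name):
        """Save the current box under a mark name."""
        self.marks[name] = self.box

    def pushbox(self, mark=None):
        """Push the box; if a mark is given, it then overwrites the box."""
        self.stack.append(self.box)
        if mark is not None:
            self.box = self.marks[mark]

    def popbox(self, mark=None):
        """Replace the box with a mark, or else pop the box stack."""
        if mark is not None:
            self.box = self.marks[mark]
        else:
            self.box = self.stack.pop()


state = BoxState((0, 0, 2, 2))
state.mark("a")
state.box = (5, 5, 9, 9)
state.pushbox("a")   # box becomes mark a; old box is on the stack
state.popbox()       # old box is restored
print(state.box)
```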

put <layers>

The yank buffer information in <layers> is copied back
to the box location. If <layers> is omitted, it
defaults to "*Sl".

quit If any cells have changed since they were last saved on
disk, the user is given a chance to write them out or
abort the command. Otherwise the program returns to
the shell.

reset

The  graphics  display  is  reinitialized  and  the  colormap 
is  reloaded. 

return 

The  current  subedit  is  left,  and  the  containing  edit  is 
resumed. 

savecell <name>

If <name> is specified then the current cell is given
that name and written to disk under the name (a ".ca"
extension is supplied by default). If <name> isn't
specified then the cell is written out to the disk file
from which it was read.

scroll <direction> <amount> <units>

The current view is moved in the indicated direction by
the indicated amount. <direction> must be one of
"left", "right", "up", or "down"; <amount> is a
floating-point number, and <units> is one of "screens"
or "lambda". <units> defaults to "screens", and
<amount> defaults to 0.5.

search <regexp>

Search labels and bounding boxes underneath the box for
text matching <regexp>. See the manual entry for ed
for a description of <regexp>. Push an entry onto the
box stack for each match. Even unexpanded cells are
searched.

sideways [y]

Flip the current cell sideways (i.e. about a vertical
axis). If the command is followed by a "y" then the
yank buffer is flipped instead of the current cell.

source  <filename> 

The  given  file  is  read,  and  each  line  is  processed  as 
one  long  command  (no  colons  are  necessary).  Any  line 
whose  last  character  is  backslash  is  joined  to  the  fol¬ 
lowing  line. 
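
The line-joining rule for command files can be sketched in Python as follows. The sketch assumes that the trailing backslash itself is dropped when two lines are joined, which the manual page does not state explicitly; the function name `join_continuations` is likewise an assumption.

```python
def join_continuations(lines):
    """Join lines ending in a backslash to the following line."""
    commands = []
    pending = ""
    for line in lines:
        line = line.rstrip("\n")
        if line.endswith("\\"):
            pending += line[:-1]      # drop the backslash, keep collecting
        else:
            commands.append(pending + line)
            pending = ""
    if pending:                        # file ended on a continuation line
        commands.append(pending)
    return commands


print(join_continuations(["paint \\", "red", "box left 2"]))
```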

subedit 

Make  the  current  cell  the  edit  cell,  and  edit  it  in 
context. 

technology  <file> 

Load  technology  information  from  <file>.  A  ".tech" 
extension  is  supplied  by  default. 

upsidedown [y]

Flip the current cell upside down. If the command is
followed by a "y" then the yank buffer is flipped
instead of the current cell.

usage <file>

Write  out  in  <file>  the  names  of  all  the  files  contain¬ 
ing  cell  definitions  used  anywhere  in  the  design 
hierarchy. 

view  <mark> 

If  <mark>  is  specified,  set  view  to  it,  otherwise, 
change  the  view  to  encompass  the  entire  edit  cell. 

visiblelayers  <layers> 

Set  the  visible  layers  to  include  just  <layers>.  Pre¬ 
face  <layers>  with  a  plus  or  minus  sign  to  add  to  or 
remove  from  the  currently  visible  ones. 
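
The plus and minus prefixes behave like set union and set difference on the visible layers. A minimal Python sketch, assuming each layer is named by a single character as in the LAYERS section; the function name `update_visible` is illustrative.

```python
def update_visible(current, spec):
    """Apply a visiblelayers argument to the current set of layers."""
    if spec.startswith("+"):
        return set(current) | set(spec[1:])   # add to visible layers
    if spec.startswith("-"):
        return set(current) - set(spec[1:])   # remove from visible layers
    return set(spec)                          # show just these layers


visible = {"p", "d"}
visible = update_visible(visible, "+m")
print(sorted(visible))
```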

width <size>

Set the box width to <size> (default is 2). Move variable
corner unless width is preceded by "+", else move
fixed corner.

writeall 

Run  through  interactive  script  to  write  out  all  cells 
that  have  been  modified. 

yank <layers>

Save in the yank buffer all information underneath the
box in <layers>. <layers> defaults to "*l".

ycell <name>

If <name> is specified, do the equivalent of ":getcell
<name>". Then expand current cell, yank it, delete the
cell, and put back everything that was yanked. This
flattens the hierarchy by one level.

ysave <name>

Save the yank buffer contents in a cell named <name>. A
".ca" extension is provided by default.


LAYERS 


nMOS mask layers are:

p or r Polysilicon (red) layer.

d or g Diffusion (green) layer.

m Metal (blue) layer.

Implant (yellow) layer.

b Buried contact (brown) layer.

c Contact cut layer.

o Overglass hole (gray) layer.

e Error layer: used by design rule checkers and other
programs.

CMOS P-well mask layers are (using technology cmos-pw):

p or r Polysilicon (red) layer.

d or g Diffusion (green) layer.

m Metal (blue) layer.

c Contact cut layer.

P or y P+ implant (pale yellow) layer.

w P-well (brown stipple) layer.

o Overglass hole (gray) layer.

e Error layer: used by design rule checkers and other
programs.

Predefined  system  layers  are: 

*  All  mask  layers. 

l Label layer.

S  Subcell  layer. 

C  Cursor  layer. 

G  Grid  layer. 

B  Background  layer. 

SYSTEM  MARKS 

C The bounding box of the current cell.

S The bounding box of the edit cell.

P The previous view.

R The bounding box of the root cell.

V The current view.

FILES

~cad/new/caesar, ~cad/doc/caesar.tblms


SEE  ALSO 

cif2ca(1)

NAME

cif2ca - convert CIF files to CAESAR files
SYNOPSIS

cif2ca [ -l lambda ] [ -t tech ] [ -o offset ] ciffile
DESCRIPTION

cif2ca accepts as input a CIF file and produces a CAESAR
file for each defined symbol. Specifying the -l lambda
option scales the output to lambda centi-microns per lambda.
The default scale is 200 centi-microns per lambda. The -t
tech option causes layers from the specified technology to
be acceptable. The default technology is nmos. For a list
of acceptable technologies, see caesar(1). The -o offset
option  causes  all  CIF  numbers  to  be  incremented  by  offset. 
This  is  useful  when  the  CIF  numbers  are  used  for  Caesar  file 
names,  and  when  several  CIF  files  with  overlapping  numbers 
are  to  be  joined  together  in  Caesar. 

Each  symbol  defined  in  the  CIF  file  creates  a  CAESAR  file. 

By default, the files are named "symbolm.ca", where m is
the CIF symbol number (as modified by the -o offset). Symbols
can also be named with a user-extension "9" command,
giving a name to the symbol definition which encloses it.

CIF commands which appear outside of symbol definitions are
gathered into a symbol called, by default, "project", and
are output to the CAESAR file "project.ca".
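
The default naming rule combines the CIF symbol number with the -o offset. A hedged Python sketch of that rule, in which the function name `default_caesar_filename` is an assumption made for the example:

```python
def default_caesar_filename(symbol_number, offset=0):
    """Default CAESAR file name for a CIF symbol, after applying -o offset."""
    return "symbol%d.ca" % (symbol_number + offset)


# Two CIF files that both define symbol 7 can be kept apart with an offset.
print(default_caesar_filename(7))
print(default_caesar_filename(7, offset=100))
```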

SEE  ALSO 

caesar(1)

DIAGNOSTICS 

Diagnostics from cif2ca are supposed to be self-explanatory.
Each diagnostic gives the line number from the input file,
an error class (informational, warning, fatal, or panic),
the error message, and the action taken by cif2ca, usually
to ignore the CIF command. Informational messages usually
refer to limitations of cif2ca. Warning messages usually
refer to inconsistencies in the CIF file; these will typically
result in CAESAR files which do not accurately reflect
the input CIF file. Fatal messages refer to fatal
inconsistencies or errors in the CIF file. A fatal error
terminates cif2ca processing. Panic messages refer to internal
problems with cif2ca. If any diagnostics are produced, a
summary of the diagnostics is produced.


BUGS

"Delete Definitions" commands are not implemented. cif2ca
also has certain restrictions due to restrictions of CAESAR;
e.g. non-manhattan objects are not allowed.

Library cells are not automagically included.

Some  care  should  be  taken  in  naming  symbols,  since  symbol 
names  are  used  for  CAESAR  file  names.  Names  which  are  not 
unique  in  the  first  14  characters  will  attempt  to  create  the 
same  CAESAR  file,  and  only  the  last  one  wins.  Similarly, 
one  should  avoid  trying  to  have  two  project.ca  files  in  the 
same  directory. 
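
Name clashes of this kind can be found before running cif2ca by grouping symbol names on their first 14 characters. The Python sketch below is illustrative; the function name `colliding_names` and the sample symbol names are assumptions, and the sketch does not account for any characters taken up by the ".ca" extension.

```python
from collections import defaultdict


def colliding_names(names, limit=14):
    """Group names that share their first `limit` characters."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:limit]].append(name)
    return {prefix: group for prefix, group in groups.items() if len(group) > 1}


print(colliding_names(["shift_register_a", "shift_register_b", "adder"]))
```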
NAME

cifplot - CIF interpreter and plotter
SYNOPSIS

cifplot [ options ] file1.cif [ file2.cif ... ]

DESCRIPTION 

Cifplot takes a description in Cal-Tech Intermediate Form
(CIF) and produces a plot. CIF is a low-level graphics
language suitable for describing integrated circuit layouts.
Although CIF can be used for other graphics applications,
for ease of discussion it will be assumed that CIF is used
to describe integrated circuit designs. Cifplot interprets
any legal CIF 2.0 description including symbol renaming and
Delete Definition commands. In addition, a number of local
extensions have been added to CIF, including text on plots
and include files. These are discussed later. Care has
been taken to avoid any arbitrary restrictions on the CIF
programs that can be plotted.

To get a plot call cifplot with the name of the CIF file to
be plotted. If the CIF description is divided among several
files call cifplot with the names of all files to be used.
Cifplot reads the CIF description from the files in the
order that they appear on the command line. Therefore the
CIF End command should be only in the last file since cifplot
ignores everything after the End command. After reading
the CIF description but before plotting, cifplot will
print an estimate of the size of the plot and then ask if it
should continue to produce a plot. Type y to proceed and n
to abort. A typical run might look as follows:

% cifplot lib.cif sorter.cif
Window  -5700  174000  -76500  168900 
Scale:  1  micron  is  0.004075  inches 
The  plot  will  be  0.610833  feet 
Do  you  want  a  plot?  y 

After  typing  y  cifplot  will  produce  a  plot  on  the  Benson- 
Varian  (11  inch  Versatec)  plotter. 
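
The figures in the sample run can be cross-checked with a little arithmetic. The Python sketch below assumes that the four Window values are xmin, xmax, ymin, ymax in CIF units (hundredths of a micron) and that the reported plot length corresponds to the x extent; the second assumption is an inference from the numbers, not something the manual page states. The function name `plot_extent_inches` is illustrative.

```python
def plot_extent_inches(xmin, xmax, ymin, ymax, inches_per_micron):
    """Plotted width and height in inches for a window given in CIF units."""
    width_microns = (xmax - xmin) / 100.0    # CIF units are centimicrons
    height_microns = (ymax - ymin) / 100.0
    return (width_microns * inches_per_micron,
            height_microns * inches_per_micron)


width_in, height_in = plot_extent_inches(-5700, 174000, -76500, 168900, 0.004075)
print("x extent: %.4f feet" % (width_in / 12.0))
print("y extent: %.2f inches" % height_in)
```

Under these assumptions the x extent comes to roughly 0.61 feet, close to the length printed by cifplot, and the y extent to about 10 inches, which would be consistent with a scale chosen to fit the window across an 11 inch plotter.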

Cifplot  recognizes  several  command  line  options.  These  can 
be  used  to  change  the  size  and  scale  of  the  plot,  change 
default  plot  options,  and  to  select  the  output  device. 
Several  options  may  be  selected.  A  dash(-)  must  precede 
each  option  specifier.  The  following  is  a  list  of  options 
that  may  be  included  on  the  command  line: 

-w xmin xmax ymin ymax

(window) The -w option specifies the window; by
default the window is set to be large enough to contain
the entire plot. The windowing command lets you plot
just a small section of your chip, enabling you to see
it in better detail. Xmin, xmax, ymin, and ymax should
be specified in CIF coordinates.

-s  float 

(scale)  The  -s  option  sets  the  scale  of  the  plot.  By 
default  the  scale  is  set  so  that  the  window  will  fill 
the  whole  page.  Float  is  a  floating  point  number 
specifying  the  number  of  inches  which  represents  1 
micron.  A  recommended  size  is  0.02. 

-l layer list

(layer) Normally all layers are plotted. The -l option
specifies which layers NOT to plot. The layer list
consists of the layer names separated by commas, no
spaces. There are some reserved names: allText, bbox,
outline, text, pointName, and symbolName. Including
the layer name allText in the list suppresses the plotting
of text; bbox suppresses the bounding box around
symbols; outline suppresses the thin outline that
borders each layer. The keywords text, pointName, and
symbolName suppress the plotting of certain text
created by local extension commands. text eliminates
text created by user extension 2. pointName eliminates
text created by user extension 94. symbolName eliminates
text created by user extension 9. allText,
pointName, and symbolName may be abbreviated by at, pn,
and sn respectively.


-c n

(copies) Makes n copies of the plot. Works only for
the Varian and Versatec. Default is 1 copy.


-d n

(depth) This option lets you limit the amount of detail
plotted in a hierarchically designed chip. It will
only instantiate the plot down n levels of calls.
Sometimes too much detail can hide important features
in a circuit.


-g n

(grid) Draw a grid over the plot with spacing every n
CIF units.

-h (half) Plot at half normal resolution. (Not yet
implemented.)

-e  (extensions)  Accept  only  standard  CIF.  User  extensions 
produce  warnings. 

-I (non-Interactive) Do not ask for confirmation. Always

-L (List) Produce a listing of the CIF file on standard
output as it is parsed. Not recommended unless debugging
hand-coded CIF since CIF code can be rather long.


-a n

(approximate) Approximate a roundflash with an n-sided
polygon. By default n equals 8 (i.e., roundflashes
are approximated by octagons). If n equals 0 then output
circles for roundflashes. (It is best not to use
full circles since they significantly slow down plotting.)
(Full circles not yet implemented.)

-b  "text" 

(banner)  Print  the  text  at  the  top  of  the  plot. 

-C  (Comments)  Treat  comments  as  though  they  were  spaces. 
Sometimes  CIF  files  created  at  other  universities  will 
have  several  errors  due  to  syntactically  incorrect  com¬ 
ments.  (I.e.  the  comments  may  appear  in  the  middle  of 
a  CIF  command  or  the  comment  does  not  end  with  a  semi¬ 
colon.)  Of  course,  CIF  files  should  not  have  any  errors 
and  these  comment  related  errors  must  be  fixed  before 
transmitting  the  file  for  fabrication.  But  many  times 
fixing  these  errors  seems  to  be  more  trouble  than  it  is 
worth,  especially  if  you  just  want  to  get  a  plot.  This 
option  is  useful  in  getting  rid  of  many  of  these  com¬ 
ment  related  syntax  errors. 

-r  (rotate)  Rotate  the  plot  90  degrees. 

-V  (Varian)  Send  output  to  the  varian.  (This  is  the 

default  option.) 

-W  (Wide)  Send  output  directly  to  the  versatec.  (Not 
available  at  NPS.) 

-S  (Spool)  Store  the  output  in  a  temporary  file  then  dump 
the  output  quickly  onto  the  Versatec.  Makes  nice  crisp 
plots;  also  takes  up  a  lot  of  disk  space. 

-T  (Terminal)  Send  output  to  the  terminal.  (Not  yet  fully 
implemented. ) 

-Gh

-Ga (Graphics terminal) Send output to terminal using its
graphics capabilities. -Gh indicates that the terminal
is an HP2648. -Ga indicates that the terminal is an
AED 512.

-X  basename 

(extractor) From the CIF file create a circuit
description suitable for switch level simulation. It
creates two files: basename.sim, which contains the
circuit description, and basename.node, which contains the
node numbers and their location used in the circuit
description.

When this option is invoked no plot is made. Therefore
it is advisable not to use any of the other options
that deal only with plotting. However, the window,
layer, and approximate options are still appropriate.

To get a plot of the circuit with the node numbers call
cifplot again, without the -X option, and include
basename.nodes in the list of CIF files to be plotted.
(This file must appear in the list of files before the
file with the CIF End command.)


-c  n 

(copies)  The  -c  specifies  the  number  of  copies  of  the 
plot  you  would  like.  This  allows  you  to  get  many  copies 
of  a  plot  with  no  extra  computation. 

-P  pattern  file 

(Pattern)  The  -P  option  lets  you  specify  your  own 
layers  and  stipple  patterns.  Pattern  file  may  contain 
an  arbitrary  number  of  layer  descriptors.  A  layer 
descriptor  is  the  layer  name  in  double  quotes,  followed 
by  8  integers.  Each  integer  specifies  32  bits  where 
ones  are  black  and  zeroes  are  white.  Thus  the  8 
integers  specify  a  32  by  8  bit  stipple  pattern.  The 
integers may be in decimal, octal, or hex. Hex numbers
start with '0x'; octal numbers start with '0'. The CIF
syntax requires that layer names be made up of only
uppercase letters and digits, and not longer than four
characters. The following is an example of a layer
description for polysilicon:

"NP" 0x08080808 0x04040404 0x02020202 0x01010101
0x80808080 0x40404040 0x20202020 0x10101010

-F  font  file 

(Font)  The  -F  option  indicates  which  font  you  want  for 
your  text.  The  file  must  be  in  the  directory 
'/usr/lib/vfont'. The default font is Roman 6 point.
Obviously,  this  option  is  only  useful  if  you  have  text 
on  your  plot. 

-O filename

(Output) After parsing the CIF files, store an
equivalent but easy to parse CIF description in the
specified file. This option removes the include and
array commands (see next section) and replaces them
with equivalent standard CIF statements. The resulting
file is suitable for transmission to other facilities
for fabrication.
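
The layer descriptor format described for the -P option (a quoted layer name followed by 8 integers, each supplying 32 bits of stipple) can be explored with a short script. This is only a sketch under stated assumptions: the function names are hypothetical, the whole descriptor is assumed to be available as one string, and drawing the most significant bit of each integer at the left is a guess about how the plotter orders bits.

```python
import shlex


def parse_int(token):
    """Parse a decimal, octal (leading 0), or hex (leading 0x) integer."""
    if token.lower().startswith("0x"):
        return int(token, 16)
    if token.startswith("0") and len(token) > 1:
        return int(token, 8)
    return int(token, 10)


def parse_layer_descriptor(text):
    """Split a descriptor into (layer_name, list of 8 row integers)."""
    tokens = shlex.split(text)  # handles the double-quoted layer name
    name, rows = tokens[0], [parse_int(t) for t in tokens[1:]]
    if len(rows) != 8:
        raise ValueError("a layer descriptor needs exactly 8 integers")
    if any(not 0 <= row < 2 ** 32 for row in rows):
        raise ValueError("each integer must fit in 32 bits")
    return name, rows


def render(rows):
    """Draw each 32-bit row as text: '#' for a one (black), '.' for a zero."""
    return [
        "".join("#" if (row >> bit) & 1 else "." for bit in range(31, -1, -1))
        for row in rows
    ]


name, rows = parse_layer_descriptor(
    '"NP" 0x08080808 0x04040404 0x02020202 0x01010101 '
    "0x80808080 0x40404040 0x20202020 0x10101010"
)
print(name)
print("\n".join(render(rows)))
```

Rendering the polysilicon example this way shows a diagonal line pattern repeating every 8 bits.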

In  the  definition  of  CIF  provisions  were  made  for  local 
extensions.  All  extension  commands  begin  with  a  number. 

Part  of  the  purpose  of  these  extensions  is  to  test  what 
features  would  be  suitable  to  include  as  part  of  the  stan¬ 
dard  language.  But  it  is  important  to  realize  that  these 
extensions  are  not  standard  CIF  and  that  many  programs 
interpreting CIF do not recognize them. If you use these
extensions it is advisable to create another CIF file using
the -O option described above before submitting your circuit
for fabrication. The following is a list of extensions
recognized by cifplot.

0I filename;

(Include)  Read  from  the  specified  file  as  though  it 
appeared  in  place  of  this  command.  Include  files  can 
be  nested  up  to  6  deep. 

0A s m n dx dy;

(Array) Repeat symbol s m times with dx spacing in the
x-direction and n times with dy spacing in the
y-direction. s, m, and n are unsigned integers. dx and
dy are signed integers in CIF units.

1  message; 

(Print)  Print  out  the  message  on  standard  output  when 
it  is  read. 

2  "text"  transform  ; 

2C  "text"  transform  ; 

(Text on Plot) Text is placed on the plot at the position
specified by the transformation. The allowed
transformations are the same as those allowed for
the Call command. The transformation affects only the
point at which the beginning of the text is to appear.
The text is always plotted horizontally, thus the mirror
and rotate transformations are not really of much
use. Normally text is placed above and to the right of
the reference point. The 2C command centers the text
about the reference point.

9 name;

(Name symbol) name is associated with the current symbol.

94 name x y;

94 name x y layer;

(Name point) name is associated with the point (x, y).
Any mask geometry crossing this point is also associated
with name. If layer is present then just geometry
crossing the point on that layer is associated with
name. For plotting this command is similar to text on
plot. When doing circuit extraction this command is
used to give an explicit name to a node. Name must not
have any spaces in it, and it should not be a number.
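
As an illustration of how an array extension could be rewritten in standard CIF, the sketch below expands the array command into one Call command with a translation per repetition. It is not cifplot's own expansion, and its output may differ from what the output option writes: the function name expand_array is hypothetical, and placing the first copy at offset (0, 0) is an assumption.

```python
def expand_array(s, m, n, dx, dy):
    """Return standard CIF Call commands equivalent to '0A s m n dx dy;'.

    Symbol s is repeated m times along x (spacing dx) and n times along y
    (spacing dy). The first copy is assumed to sit at offset (0, 0).
    """
    if s < 0 or m < 0 or n < 0:
        raise ValueError("s, m, and n must be unsigned integers")
    return [
        f"C {s} T {i * dx} {j * dy};"
        for j in range(n)   # rows, stepping in y
        for i in range(m)   # columns, stepping in x
    ]


for command in expand_array(3, 2, 2, 100, -50):
    print(command)
```

A 2 by 2 array therefore becomes four Call commands, each carrying its own translation.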

USE WITH MACPITTS CIF

The lines starting with user extension 0, which MacPitts
places at the beginning of every CIF file, must either be
removed or "commented out" by enclosing them in an
all-encompassing set of parentheses, thus: (

MacPitts CIF files are usually very long. It has been found
most convenient to run MacPitts cifplots in the background
with the non-Interactive mode selected. A convenient way to
do this is by using the "stipple" command:

stipple file1.cif


FILES

~cad/.cadrc
~/.cadrc
~cad/bin/vdump (only in 4.1 BSD UNIX)
~cad/bin/stipple
/usr/lib/vfont/R.6
/usr/tmp/#cif*

ALSO  SEE 

mcp(cad1), vdump(cad1), cadrc(cad5)

A  Guide  to  LSI  Implementation  by  Hon  and  Sequin,  Second  Edi¬ 
tion  (Xerox  PARC,  1980)  for  a  description  of  CIF. 


BUGS 

The  -r  is  somewhat  kludgy  and  does  not  work  well  with  the 
other  options.  Space  before  semi-colons  in  local  extensions 
can  cause  syntax  errors. 

The -O option produces simple CIF with no scale factors in
the DS commands. Because of this you must supply a scale
factor to some programs, such as the -l option to cif2ca.

NAME

esim - event driven switch level simulator

SYNOPSIS

esim [file1 [file2 ...]]

DESCRIPTION 

Esim  is  an  event-driven  switch  level  simulator  for  NMOS 
transistor circuits. Esim accepts commands from the user,
executing each command before reading the next. Commands
come in two flavors: those which manipulate the electrical
network, and those which direct the simulation. Commands have
the following simple syntax:

c arg1 arg2 ... argn <newline>

where 'c' is a single letter specifying the command to be
performed and the argi are arguments to that command. The
arguments are separated by spaces (or tabs) and the command
is terminated by a <newline>.

To  run  esim  type 

esim file1 file2 ...

Esim will read and execute commands, first from file1, then
file2, etc. If one of the file names is preceded by a '-'
then that file becomes the new output file (the default
output is stdout). For example,

esim f.sim -f.out g.sim

This would cause esim to read commands from f.sim, sending
output to the default output. When f.sim was exhausted,
f.out would become the new output file, and the commands in
g.sim executed.

After  all  the  files  have  been  processed,  and  if  the  "q"  com¬ 
mand  has  not  terminated  the  simulation  run,  esim  will  accept 
further  commands  from  the  user,  prompting  for  each  one  like 
so: 

sim> 

The user can type individual commands or direct esim to
another file using the "@" command:

sim> @ patchfile.sim

This command would cause esim to read commands from
"patchfile.sim", returning to interactive input when the
file was exhausted.

It  is  common  to  have  an  initial  network  file  prepared  by  a 
node  extractor  with  perhaps  a  patch  file  or  two  prepared  by 
hand.  After  reading  these  files  into  the  simulator,  the 
user  would  then  interactively  direct  esim.  This  could  be 
accomplished  as  follows: 

esim file.sim patch.1 patch.2

After reading the files, esim would prompt for the first
command. Or we could have typed:

% esim file.sim
sim> @ patch.1
sim> @ patch.2

Network  Manipulation  Commands 

The  electrical  network  to  be  simulated  is  made  up  of 
enhancement  and  depletion  mode  transistors  interconnected  by 
nodes.  Components  can  be  added  to  the  network  with  the  fol¬ 
lowing  commands: 

e gate source drain

e gate source drain length width key xpos ypos area

Adds enhancement mode transistor to network with
the specified gate, source, and drain nodes. The
longer form includes size and location information
as provided by the node extractor; when making
patches the short form is usually used.

d gate source drain

d gate source drain length width key xpos ypos area

Like "e" except for depletion mode devices.

C node1 node2 cap

Increase the capacitance between node1 and node2 by
cap. Esim ignores this unless either node1 or
node2 is GND.

= node name1 name2 name3

Allows the user to specify synonyms for a given
node. Used by the node extractor to relate
user-provided node names to the node's internal name
(usually just a number).

| comment...

Lines beginning with vertical bar are treated as
comments and ignored; useful for deleting pieces
of network in node extractor output files.

i node

Input record, output by node extractor and not
used by esim.

Currently, there is no way to remove components from the
network once they have been added. You must go back to the
input files and modify them (using the comment character) to
exclude those components you wish removed. "N" records
need not be included for new nodes the user wishes to patch
into the network.
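
The record types above are simple enough to read with a few lines of script, which can be handy for checking a patch file before handing it to the simulator. The following is a rough sketch, not esim's own reader: the names load_network and Transistor are hypothetical, alias records are assumed to begin with an equal sign (as the mextra description of its log file states), and any other record type is silently skipped.

```python
from collections import namedtuple

Transistor = namedtuple("Transistor", "kind gate source drain")


def load_network(lines):
    """Collect transistors, capacitances, and aliases from network records."""
    transistors, capacitances, aliases = [], [], {}
    for raw in lines:
        fields = raw.split()
        if not fields or fields[0].startswith("|"):
            continue  # blank line or comment
        command, args = fields[0], fields[1:]
        if command in ("e", "d"):
            # Short and long forms share the first three arguments.
            transistors.append(Transistor(command, *args[:3]))
        elif command == "C":
            capacitances.append((args[0], args[1], float(args[2])))
        elif command == "=":
            aliases.setdefault(args[0], []).extend(args[1:])
        # Any other record (for example "N") is ignored in this sketch.
    return transistors, capacitances, aliases


sample = [
    "| hand-made patch",
    "e in gnd out",
    "d out out vdd",
    "C out GND 75",
    "= 12 out result",
]
transistors, capacitances, aliases = load_network(sample)
print(len(transistors), "transistors,", len(capacitances), "capacitor")
```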

Simulator  Commands 

The user can specify which nodes are to have their values
displayed after each simulation step:

w node1 -node2 node3 ...

Watch node1 and node3, stop watching node2. At
the end of a simulation step, each watched node
will be displayed like so:

node1=0 node3=X ...

To remove a node from the watched list, preface
its name with a '-' in a "w" command.

W label node1 node2 ... noden

Watch bit vector. The values of nodes node1, ...,
noden will be displayed as a bit vector:

label=010100 20

where the first 0 is the value of node1, the first
1 the value of node2, etc. The number displayed
to the right is the value of the bit vector
interpreted as a binary number; this is omitted if the
vector contains an X value. There is no way to
unwatch a bit vector.
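
The way a watched bit vector is turned into a number can be sketched as follows. The function name bit_vector_value is hypothetical and this is not esim's code; it only mirrors the rule stated above, in which the first node supplies the most significant bit and no number is produced when any node is X.

```python
def bit_vector_value(values):
    """Return (bit string, integer value) for a list of node values.

    Each value is "0", "1", or "X". The first node is the most
    significant bit. The integer is None if any node is X.
    """
    bits = "".join(values)
    if not bits:
        raise ValueError("a bit vector needs at least one node")
    value = None if "X" in bits else int(bits, 2)
    return bits, value


print(bit_vector_value(["0", "1", "0", "1", "0", "0"]))  # ('010100', 20)
print(bit_vector_value(["0", "1", "X"]))                 # ('01X', None)
```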

Before each simulation step the user can force nodes to be
either high (1) or low (0) inputs (an input's value cannot
be changed by the simulator):

h node1 node2 ...

Force each node on the argument list to be a high
input; overrides previous input commands if
necessary.

l node1 node2 ...

Like "h" except forces nodes to be a low input.

x node1 node2 ...

Removes nodes from whatever input list they happen
to be on. The next simulation step will determine
their correct value in the circuit. This is the
default state of most nodes. Note that this does
not force nodes to have an "X" value; it simply
removes them from the input lists.

The  current  value  of  a  node  can  be  determined  in  several 
ways: 

v 

View,  prints  the  values  of  all  watched  nodes  and 
nodes  on  the  high  and  low  input  lists. 

? node1 node2 ...

Prints a synopsis of the named nodes including
their current values and the state of all transistors
that affect the value of these nodes. This
is the most common way of wandering through the
network in search of what went wrong...

! node1 node2 ...

For each node in the argument list, prints a list
of transistors controlled by that node.

"?" and "!" allow the user to go both backwards and forwards
through the network in search of that piece causing all the
problems.

The  simulator  is  invoked  with  the  following  commands: 
s

Simulation step. Propagates new values for the
inputs through the network, returns when the network
has settled. If things don't settle, the command
will never terminate; try the "w" and "D" commands
to narrow down the problem.

c

Cycle once through the clock, as defined by the K
command.

I

Initialize. Circuits with state are often hard to
initialize because the initial value of each node
is X. To cure this problem, the I command finds
each node whose value is charged-X and changes it
to charged-0, then runs a simulation step. If one
iterates the I command a couple of times, this often
leads to a stable initialized condition (indicated
when an I command takes 0 events, i.e., the circuit
is stable).

Try it; if the circuit does not become stable in 3
or 4 tries, this command is probably of no use.

Miscellaneous  Commands 

D 

toggle debug switch; useful for debugging the simulator
and/or circuit. If the debug switch is on, then
during a simulation step each time a watched node is
encountered in some event, that fact is indicated to
the user along with some event info. If a node
keeps appearing in this printout, chances are that
its value is oscillating. Vice versa, if your
circuit never settles (i.e., it oscillates), you
can use the "D" and "w" commands to find the
node(s) that are causing the problem.

>  filename 

write current state of each node into specified
file; useful for making a breakpoint in your simulation
run. Only stores values so isn't really
useful to "dump" a run for later use; see "<"
command.

<  filename 

read  from  specified  file,  reinitializing  the  value 
of  each  node  as  directed.  Note  that  network  must 
already  exist  and  be  identical  to  the  network  used 
to  create  the  dump  file  with  the  ">"  command. 

These  state  saving  commands  are  really  provided  so 
that  complicated  initializing  sequences  need  only 
be  simulated  once. 

L 

invokes network processor that finds all subnets
corresponding to simple logic gates and converts
them into a form that allows faster simulation.
Often it does the right thing, leading to a 25% to
50% reduction in the time for a single step. [We
know of one case where the transformation was not
transparent, so caveat simulee...]

X ...

call extension command; provides for user extensions
to simulator.

q 

exit  to  system. 

Local  Extensions 

V  node  vector 

Define a vector of inputs for the node. The first
element is initially set as the input for node.
Set the next element of the vector as the input
after a cycle.

R  n 

Run  the  simulator  through  n  cycles.  If  n  is  not 
present  make  the  run  as  long  as  the  longest  vec¬ 
tor.  All  watch  nodes  are  reported  back  as  vec¬ 
tors. 

N 

Clear  all  previously  defined  input  vectors. 

K node1 vector1 node2 vector2 ... nodeN vectorN

Define  the  clock.  Each  cycle,  nodes  1  through  N 
must  run  through  their  respective  vectors. 

SEE  ALSO 

mextra(CAD1)

NAME

mextra - Manhattan Circuit Extractor

SYNOPSIS

mextra [-gho] [-u scale] basename

DESCRIPTION

Mextra reads an integrated circuit layout description in
Caltech Intermediate Form (CIF) and creates a circuit
description. From this circuit description various electrical
checks can be done on your circuit. The circuit
description is directly compatible with esim, moserc, and
powest.

Names 

Mextra  uses  the  CIF  label  construct  to  implement  node  names 
and  attributes.  The  form  of  the  CIF  label  command  is  as 
follows:

94 name x y [layer];

This command attaches the label to the mask geometry on the
specified layer crossing the point (x, y). If no layer is
present then any geometry crossing the point is given the
label. Mextra does not recognize the CIF user extension "0"
which is used by MIT and Lincoln Labs programs (e.g.,
MacPitts) to indicate node labels.

Mextra interprets these labels as node names. These names
are used to describe the extracted circuit. When no name is
given to a node, a number is assigned to the node. A label
may contain any ASCII character except space, tab, newline,
double quote, comma, semi-colon, and parenthesis. To avoid
conflict with extractor generated names, names should not be
numbers or end in '#n' where n is a number.

A  problem  arises  when  two  nodes  are  given  the  same  name 
although  they  are  not  connected  electrically.  Sometimes  we 
want  these  nodes  to  have  the  same  names,  other  times  we 
don't.  This  frequently  happens  when  a  name  is  specified  in 
a  cell  which  is  repeated  many  times.  For  instance,  if  we 
define a shift register cell with the input marked 'SR.in'
then when we create an 8 bit shift register we could have 8
nodes named 'SR.in'. If this happens it would appear as
though all 8 of the shift register cells were shorted
together. To resolve this the extractor recognizes three
different types of names: local, global, and unspecified.

Any time a local name appears on more than one node it is
appended with a unique suffix of the form '#n' where n is a
number. The numbers are assigned in scanline order,
starting at 0. In the shift register example, the names
would be 'SR.in#0' through 'SR.in#7'. Global names do not
have suffixes appended to them. Thus unconnected nodes with
global names will appear connected after extraction. (The
-g option causes the extractor to append unique suffixes to
unconnected nodes with the same global name.) Names are made
local by ending them with a sharp sign, '#'. Names are global
if they end with an exclamation mark, '!'. These terminating
characters are not considered part of the name,
however. Names which do not end with these characters are
considered unspecified. Unspecified names are treated
similarly to locals: multiple occurrences are appended with
unique suffixes. By convention, unspecified names signify
the designer's intention that this name is a local name, but
is connected to only one node. It is illegal to have a name
that is declared as two different types. The extractor will
complain if this is so and make the name local.
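
The naming rules can be summarized in a few lines of script. This is a simplified sketch, not the extractor's algorithm: the function names are hypothetical, the input is assumed to be one label per electrically distinct node already listed in scanline order, the -g option is not modeled, and a name declared with two different types is not detected.

```python
from collections import Counter


def classify(label):
    """Return (name, kind) with the terminating character stripped."""
    if label.endswith("#"):
        return label[:-1], "local"
    if label.endswith("!"):
        return label[:-1], "global"
    return label, "unspecified"


def resolve_names(labels):
    """Append '#n' suffixes to local and unspecified names that repeat."""
    parsed = [classify(label) for label in labels]
    counts = Counter(name for name, kind in parsed if kind != "global")
    next_suffix = Counter()
    resolved = []
    for name, kind in parsed:
        if kind != "global" and counts[name] > 1:
            resolved.append(f"{name}#{next_suffix[name]}")
            next_suffix[name] += 1
        else:
            resolved.append(name)  # globals and unique names stay as they are
    return resolved


print(resolve_names(["SR.in#"] * 8))
print(resolve_names(["Vdd!", "Vdd!", "out"]))
```

In the shift register example the eight labels become SR.in#0 through SR.in#7, while the two global Vdd labels keep the same name and so appear connected.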

Optionally  mextra  will  expand  local  and  unspecified  node 
names  with  the  path  name  of  the  symbol  instances  through 
which  they  were  called.  By  using  the  -h  option  mextra  will 
produce  node  names  of  the  form: 

/call1/call2/.../callN/node-name

where callN is the name of the symbol instance which contains
the label node-name, callN-1 is the name of the
instance which contains callN, and so on. Named symbol
instances take the following form in CIF:

91 name; C number [a b];

Unnamed  CIF  calls  are  assigned  names  of  the  form  '#n',  where 
n  is  a  number. 

It  makes  no  difference  to  the  extractor  if  the  same  name  is 
attached  to  the  same  node  several  times.  However,  if  more 
than  one  name  is  given  to  a  node  then  the  extractor  must 
choose  which  name  it  will  use.  Whenever  two  names  are  given 
to  the  same  node  the  extractor  will  assign  the  name  with  the 
highest  type  priority,  global  being  the  highest,  unspecified 
next,  local  lowest.  If  the  names  are  the  same  type  then  the 
extractor takes the one with the fewest slashes ('/'); if the
number of slashes is equal, the shortest name is taken.
This causes the name highest up in the symbol hierarchy to
be taken when hierarchical names are expanded. At the end
of  the  log  file  the  extractor  lists  nodes  with  more  than  one 
name  attached.  These  lines  start  with  an  equal  sign  and  are 
readable  by  esim  so  that  it  will  understand  these  aliases. 
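
The choice among several names attached to one node amounts to a three-part comparison, which the sketch below expresses as a sort key. The function name preferred_name is hypothetical and each candidate is assumed to be supplied as a (name, kind) pair; how the extractor breaks a tie between names of equal length is not stated above and is left to Python's min here.

```python
# Lower number means higher priority.
TYPE_PRIORITY = {"global": 0, "unspecified": 1, "local": 2}


def preferred_name(candidates):
    """Pick the name the rules above would keep for a node.

    candidates is an iterable of (name, kind) pairs, where kind is
    "global", "unspecified", or "local".
    """
    best = min(
        candidates,
        key=lambda c: (TYPE_PRIORITY[c[1]], c[0].count("/"), len(c[0])),
    )
    return best[0]


print(preferred_name([("/adder/bit3/out", "local"), ("/adder/sum3", "local")]))
print(preferred_name([("x", "local"), ("clock", "global")]))
```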

Attributes 

In addition to naming nodes mextra allows you to attach
attributes to nodes. There are two types of attributes:
node attributes and transistor attributes. A node attribute
is attached to a node using the CIF 94 construct, just
the same way as a node name. The node attribute must end in
an at-sign, '@'. More than one attribute may be attached to
a node. Mextra does not interpret these attributes other
than to eliminate duplicates. For each attribute attached
to a node there appears a line in the .sim file in the
following form:

A node attribute

Node is the node name, and attribute is the attribute
attached to that node with the at-sign removed.

Transistor  attributes  can  be  attached  to  the  gate,  source, 
or  drain  of  a  transistor.  Transistor  attributes  must  end  in 
a  dollar  sign,  '$'.  To  attach  an  attribute  to  a  transistor 
gate  the  label  must  be  placed  inside  the  transistor  gate 
region.  To  attach  an  attribute  to  a  source  or  drain  of  a 
transistor  the  label  must  be  placed  on  the  source  or  drain 
edge  of  a  transistor.  Transistor  attributes  are  recorded  in 
the  transistor  record  in  the  .sim  file.  A  transistor 
description  has  the  following  form: 

type gate source drain l w x y g=attributes
s=attributes d=attributes

Attributes is a comma-separated list of attributes. If no
attribute is present for the gate, source, or drain the g=,
s=, or d= fields may be omitted.
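
A transistor record of this kind can be split into its parts with a short function. This is a minimal sketch rather than the reader used by any of the tools: the function name parse_transistor is hypothetical, the record is assumed to sit on a single line, and each optional attribute field is assumed to be written as g=, s=, or d= followed by a comma-separated list.

```python
def parse_transistor(line):
    """Split a .sim transistor record into a dictionary."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("expected type, three nodes, l, w, x, and y")
    kind, gate, source, drain = fields[:4]
    length, width, x, y = (float(f) for f in fields[4:8])
    attrs = {"g": [], "s": [], "d": []}
    for field in fields[8:]:
        key, _, value = field.partition("=")
        if key in attrs and value:
            attrs[key] = value.split(",")
    return {
        "type": kind, "gate": gate, "source": source, "drain": drain,
        "length": length, "width": width, "x": x, "y": y, "attrs": attrs,
    }


record = parse_transistor("e in gnd out 2 8 100 200 g=fast,wide d=probe")
print(record["gate"], record["attrs"])
```

Fields that are omitted simply leave the corresponding attribute list empty.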

Capacitance 

The .sim file also has information about capacitance in the
circuit. The lines containing capacitance information are
of the form:

C node1 node2 cap-value

cap-value is the capacitance between the nodes in
femto-farads. Capacitance values below a certain threshold are
not reported. The default threshold is 50 femto-farads.

The  extractor  reports  capacitance  from  two  sources  -  capaci¬ 
tance  between  node  and  substrate,  and  capacitance  caused  by 
poly  overlapping  diffusion  but  not  forming  a  transistor. 
Transistor  capacitances  are  not  included  since  most  of  the 
tools  that  work  on  the  .sim  file  calculate  the  transistor 
capacitance  from  the  width  and  length  information. 

The capacitance for each layer is calculated separately.
The reported node capacitance is the total of the layer
capacitances of the node. The layer capacitance is calculated
by taking the area of a node on that layer and multiplying
it by a constant. This is added to the product of
the perimeter and a constant. The default constants are
given below. Area constants are in femto-farads per square
micron. Perimeter constants are in femto-farads per micron.

layer       area    perimeter
metal       0.03    0.0
poly        0.05    0.0
diff        0.1     0.1
poly/diff   0.4     0.0

Poly/diffusion capacitance is calculated similarly to layer
capacitance. The area is multiplied by a constant and this is
added to the perimeter multiplied by a constant.
Poly/diffusion capacitance is not thresholded, however.
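
The node capacitance calculation can be written out directly from the default constants. The sketch below is illustrative only: the function names and the dictionary layout are hypothetical, areas and perimeters are assumed to be supplied in square microns and microns, and a value exactly at the threshold is assumed to be reported.

```python
# Default constants from the table above (femto-farads).
AREA_FF_PER_UM2 = {"metal": 0.03, "poly": 0.05, "diff": 0.1}
PERIM_FF_PER_UM = {"metal": 0.0, "poly": 0.0, "diff": 0.1}
DEFAULT_THRESHOLD_FF = 50.0


def node_capacitance(geometry):
    """Sum area and perimeter contributions over the layers of one node.

    geometry maps a layer name to (area in square microns,
    perimeter in microns).
    """
    return sum(
        area * AREA_FF_PER_UM2[layer] + perimeter * PERIM_FF_PER_UM[layer]
        for layer, (area, perimeter) in geometry.items()
    )


def is_reported(cap_ff, threshold_ff=DEFAULT_THRESHOLD_FF):
    """Values below the threshold are not reported."""
    return cap_ff >= threshold_ff


cap = node_capacitance({"metal": (1000, 220), "diff": (100, 40)})
print(round(cap, 3), is_reported(cap))
```

For this node the metal contributes 30 femto-farads and the diffusion 14, for a total of 44, which falls below the default threshold.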

The -o option suppresses the calculation of capacitance, and
instead gives, for each node in the circuit, the area and
perimeter of that node on the diffusion, poly, and metal
layers. The lines containing this information look like
this:

N node diff-area diff-perim poly-area poly-perim
metal-area metal-perim

Node is the node name. Diff-area through metal-perim are
the area and perimeter of the diffusion, poly, and metal
layers in user defined units. (In addition the -o option
causes transistors with only one terminal to be recorded in
the .sim file as a transistor with source connected to
drain.)

Setting  Options 

By default, mextra reports locations in CIF units. A more
convenient form of units may be specified either in the
'.cadrc' file or on the command line. The form of the
'.cadrc' command is:

units scale

To set units on the command line use the -u option.

The parameters used to compute node capacitance may be
changed by including the following commands in your '.cadrc'
file.

areatocap layer value
perimtocap layer value

value is atto-farads per square micron for area, and
atto-farads per micron for perimeter. layer may be "poly",
"diff", "metal", or "poly/diff". The threshold for reporting
capacitance may be set in the '.cadrc' file with the
following line.

capthreshold value

A negative value sets the threshold to infinity.

Mextra knows of two technologies, NMOS and CMOS p-well.
NMOS is assumed by default. To set the technology to CMOS
p-well, include the following line in your '.cadrc' file:

tech cmos-pw


FILES

~cad/lib/extname
~cad/lib/log
~cad/.cadrc
~/.cadrc
/usr/tmp/#mext*

ALSO SEE

caesar(cad1), kic(cad1), powest(cad1), cadrc(cad5)


BUGS 

Accepts manhattan simple CIF only. The length/width ratio
for unusually shaped transistors may be inaccurate.
Attributes for funny transistors are not recorded.

The first five figures show, in the order that they were
produced, .int files from a MacPitts interpreter session
using the source file, multip8c.mac.

The last three figures show the terminal output produced
by the switch level event simulation program, esim, operating
on the node extraction file of the MacPitts layout for
multip8c. The node extraction was performed by the mextra
program.


"multip8c"

MacPitts interpreter state after initial data entry.

((register a1 undefined-integer)
 (register a2 undefined-integer)
 (register a3 undefined-integer)
 (register a4 undefined-integer)
 (register hr1 undefined-integer)
 (register lr1 undefined-integer)
 (register hr2 undefined-integer)
 (register lr2 undefined-integer)
 (register hr3 undefined-integer)
 (register lr3 undefined-integer)
 (register hr4 undefined-integer)
 (register lr4 undefined-integer)
 (port ain 104 console)
 (port bin 22 console)
 (port hin 0 console)
 (port aout undefined-integer chip)
 (port hout undefined-integer chip)
 (port lout undefined-integer chip))


Figure D.1 MacPitts Interpreter Results.
"multip8c"

MacPitts interpreter state after 1 clock cycle.

((register a1 104)
 (register a2 undefined-integer)
 (register a3 undefined-integer)
 (register a4 undefined-integer)
 (register hr1 0)
 (register lr1 11)
 (register hr2 undefined-integer)
 (register lr2 undefined-integer)
 (register hr3 undefined-integer)
 (register lr3 undefined-integer)
 (register hr4 undefined-integer)
 (register lr4 undefined-integer)
 (port ain 104 console)
 (port bin 22 console)
 (port hin 0 console)
 (port aout undefined-integer chip)
 (port hout undefined-integer chip)
 (port lout undefined-integer chip))


"multip8c"

MacPitts interpreter state after 2 clock cycles.

((register a1 104)
 (register a2 104)
 (register a3 undefined-integer)
 (register a4 undefined-integer)
 (register hr1 0)
 (register lr1 11)
 (register hr2 52)
 (register lr2 5)
 (register hr3 undefined-integer)
 (register lr3 undefined-integer)
 (register hr4 undefined-integer)
 (register lr4 undefined-integer)
 (port ain 104 console)
 (port bin 22 console)
 (port hin 0 console)
 (port aout undefined-integer chip)
 (port hout undefined-integer chip)
 (port lout undefined-integer chip))


Figure D.2 MacPitts Interpreter Results, (Continued).


"multip8c"

MacPitts interpreter state after 3 clock cycles.

((register a1 104)
 (register a2 104)
 (register a3 104)
 (register a4 undefined-integer)
 (register hr1 0)
 (register lr1 11)
 (register hr2 52)
 (register lr2 5)
 (register hr3 78)
 (register lr3 2)
 (register hr4 undefined-integer)
 (register lr4 undefined-integer)
 (port ain 104 console)
 (port bin 22 console)
 (port hin 0 console)
 (port aout undefined-integer chip)
 (port hout undefined-integer chip)
 (port lout undefined-integer chip))

"multip8c"

MacPitts interpreter state after 4 clock cycles.

((register a1 104)
 (register a2 104)
 (register a3 104)
 (register a4 104)
 (register hr1 0)
 (register lr1 11)
 (register hr2 52)
 (register lr2 5)
 (register hr3 78)
 (register lr3 2)
 (register hr4 39)
 (register lr4 1)
 (port ain 104 console)
 (port bin 22 console)
 (port hin 0 console)
 (port aout 104 chip)
 (port hout 39 chip)
 (port lout 1 chip))


Figure D.3 MacPitts Interpreter Results, (Continued).
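
The register values in the figures above are consistent with a shift-and-add multiplier in which each pipeline stage examines the low bit of the multiplier, conditionally adds the multiplicand to the high partial product, and shifts the high/low pair right by one bit. The following Python sketch reproduces the four stage outputs for ain = 104, bin = 22, hin = 0. It is a model inferred from the listed values, not the MacPitts source; the names `stage` and `run_chip` are hypothetical.

```python
def stage(a, h, l):
    """One assumed pipeline stage: conditional add, then shift (h, l) right."""
    if l & 1:                        # low bit of the multiplier selects the add
        h += a                       # h may briefly need 9 bits
    l = (l >> 1) | ((h & 1) << 7)    # low bit of h moves into the top bit of l
    h >>= 1
    return h, l


def run_chip(a, b, h=0, stages=4):
    """Return the (h, l) pair latched after each of the chip's stages."""
    trace = []
    l = b
    for _ in range(stages):
        h, l = stage(a, h, l)
        trace.append((h, l))
    return trace


print(run_chip(104, 22))   # [(0, 11), (52, 5), (78, 2), (39, 1)]
```

Each tuple corresponds to one high/low register pair in the listings, and the last tuple matches the hout and lout port values after 4 clock cycles.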
"multip8c"

MacPitts interpreter state after 4 clock cycles and
resetting the input ports to the values of the output ports.
This simulates a second chip in cascade with the first.

((register a1 104)
 (register a2 104)
 (register a3 104)
 (register a4 104)
 (register hr1 0)
 (register lr1 11)
 (register hr2 52)
 (register lr2 5)
 (register hr3 78)
 (register lr3 2)
 (register hr4 39)
 (register lr4 1)
 (port ain 104 console)
 (port bin 1 console)
 (port hin 39 console)
 (port aout 104 chip)
 (port hout 39 chip)
 (port lout 1 chip))


"multip8c"

MacPitts interpreter state after 5 clock cycles.

((register a1 104)
 (register a2 104)
 (register a3 104)
 (register a4 104)
 (register hr1 71)
 (register lr1 128)
 (register hr2 52)
 (register lr2 5)
 (register hr3 78)
 (register lr3 2)
 (register hr4 39)
 (register lr4 1)
 (port ain 104 console)
 (port bin 1 console)
 (port hin 39 console)
 (port aout 104 chip)
 (port hout 39 chip)
 (port lout 1 chip))


Figure D.4 MacPitts Interpreter Results, (Continued).
"multip8c"

MacPitts interpreter state after 6 clock cycles.

((register a1 104)
 (register a2 104)
 (register a3 104)
 (register a4 104)
 (register hr1 71)
 (register lr1 128)
 (register hr2 35)
 (register lr2 192)
 (register hr3 78)
 (register lr3 2)
 (register hr4 39)
 (register lr4 1)
 (port ain 104 console)
 (port bin 1 console)
 (port hin 39 console)
 (port aout 104 chip)
 (port hout 39 chip)
 (port lout 1 chip))


"multip8c"

MacPitts interpreter state after 7 clock cycles.

((register a1 104)
 (register a2 104)
 (register a3 104)
 (register a4 104)
 (register hr1 71)
 (register lr1 128)
 (register hr2 35)
 (register lr2 192)
 (register hr3 17)
 (register lr3 224)
 (register hr4 39)
 (register lr4 1)
 (port ain 104 console)
 (port bin 1 console)
 (port hin 39 console)
 (port aout 104 chip)
 (port hout 39 chip)
 (port lout 1 chip))


Figure D.5 MacPitts Interpreter Results, (Continued).
"multip8c"

MacPitts interpreter state after 8 clock cycles.

((register a1 104)
 (register a2 104)
 (register a3 104)
 (register a4 104)
 (register hr1 71)
 (register lr1 128)
 (register hr2 35)
 (register lr2 192)
 (register hr3 17)
 (register lr3 224)
 (register hr4 8)
 (register lr4 240)
 (port ain 104 console)
 (port bin 1 console)
 (port hin 39 console)
 (port aout 104 chip)
 (port hout 8 chip)
 (port lout 240 chip))


"multip8c"

MacPitts interpreter state after 9 clock cycles.

((register a1 104)
 (register a2 104)
 (register a3 104)
 (register a4 104)
 (register hr1 71)
 (register lr1 128)
 (register hr2 35)
 (register lr2 192)
 (register hr3 17)
 (register lr3 224)
 (register hr4 8)
 (register lr4 240)
 (port ain 104 console)
 (port bin 1 console)
 (port hin 39 console)
 (port aout 104 chip)
 (port hout 8 chip)
 (port lout 240 chip))


Figure D.6 MacPitts Interpreter Results, (Continued).
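
The final port values can be checked by hand. Assuming that hout and lout hold the high and low bytes of the 16-bit product after the second, cascaded pass, the short Python check below joins them and compares the result with 104 times 22. The helper name `combine` is hypothetical, and the byte interpretation is an inference from the listed values.

```python
def combine(hout, lout):
    """Join the high and low output bytes into one 16-bit value."""
    return (hout << 8) | lout


multiplicand, multiplier = 104, 22
product = combine(8, 240)          # hout = 8, lout = 240 after 8 clock cycles
print(product, product == multiplicand * multiplier)   # 2288 True
```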




% esim mul8c.sim mul8c.macro
1612 transistors, 1398 nodes (801 pulled up)
1612 transistors, 1398 nodes (801 pulled up)
sim> s
step took 605 events
clock-XXX
aout-XXXXXXXX
lout-XXXXXXXX
hout-XXXXXXXX
hin-00000000     0
bin-00010110     22
ain-01101000     104
sim> I
initialization took 2119 steps
sim> I
initialization took 0 steps
sim> s
step took 0 events
clock-000        0
aout-11111111    255
lout-11111111    255
hout-11111111    255
hin-00000000     0
bin-00010110     22
ain-01101000     104
sim> c
clock-101        5
aout-11111111    255
lout-01111111    127
hout-01111111    127
hin-00000000     0
bin-00010110     22
ain-01101000     104
cycle took 1433 events
sim> c
clock-101        5
aout-11111111    255
lout-00111111    63
hout-00111111    63
hin-00000000     0
bin-00010110     22
ain-01101000     104
cycle took 1210 events

Last line is repeated at top of following page.


Figure D.7 Event Simulation Results.
cycle took 1210 events
sim> c
clock-101        5
aout-11111111    255
lout-00011111    31
hout-00011111    31
hin-00000000     0
bin-00010110     22
ain-01101000     104
cycle took 1231 events
sim> c
clock-101        5
aout-01101000    104
lout-00000001    1
hout-00100111    39
hin-00000000     0
bin-00010110     22
ain-01101000     104
cycle took 1139 events
sim> c
clock-101        5
aout-01101000    104
lout-00000001    1
hout-00100111    39
hin-00000000     0
bin-00010110     22
ain-01101000     104
cycle took 1052 events
sim> @ mul8c.macro2
sim> s
step took 177 events
clock-101        5
aout-01101000    104
lout-00000001    1
hout-00100111    39
hin-00100111     39
bin-00000001     1
ain-01101000     104
sim> c

Last line is repeated at top of following page.


Figure D.8 Event Simulation Results, (Continued).
sim> c
clock-101        5
aout-01101000    104
lout-00000001    1
hout-00100111    39
hin-00100111     39
bin-00000001     1
ain-01101000     104
cycle took 1164 events
sim> c
clock-101        5
aout-01101000    104
lout-00000001    1
hout-00100111    39
hin-00100111     39
bin-00000001     1
ain-01101000     104
cycle took 1154 events
sim> c
clock-101        5
aout-01101000    104
lout-00000001    1
hout-00100111    39
hin-00100111     39
bin-00000001     1
ain-01101000     104
cycle took 1131 events
sim> c
clock-101        5
aout-01101000    104
lout-11110000    240
hout-00001000    8
hin-00100111     39
bin-00000001     1
ain-01101000     104
cycle took 1123 events
sim> c
clock-101        5
aout-01101000    104
lout-11110000    240
hout-00001000    8
hin-00100111     39
bin-00000001     1
ain-01101000     104
cycle took 1052 events
sim> q


Figure D.9 Event Simulation Results, (Continued).
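
The decimal figures beside each vector in the simulation listings are simply the binary strings read as unsigned integers. A minimal Python sketch of that conversion is given below. It assumes the "name-bits" form that appears in these listings and treats any vector containing X as undefined; the function name `parse_vector` is hypothetical and is not part of esim.

```python
def parse_vector(line):
    """Split a line such as 'hout-00100111' into (name, value).

    The value is None when any bit is undefined (X).
    """
    name, _, bits = line.strip().partition("-")
    if not bits or set(bits) - set("01X"):
        raise ValueError(f"not a vector line: {line!r}")
    value = None if "X" in bits else int(bits, 2)
    return name, value


print(parse_vector("hout-00100111"))   # ('hout', 39)
print(parse_vector("aout-XXXXXXXX"))   # ('aout', None)
```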
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